1.0  Executive Summary

This is a report of the analysis of the U.S. Navy Aircrew Automated Escape
Systems (AAES) project from 15 May 1976 to 15 August 1976. It addresses the
following four main areas: (1) a brief survey of the analytical techniques
used, (2) reasons for using the given techniques, (3) a summary of results
obtained, and (4) suggestions for future investigation.

1.1  Analytical Techniques Used

Primary  emphasis  for  this  study  was  the  application  of  well  known 
statistical  and  deterministic  methods  to  ejection  related  personnel  injury 
data.  A flow  diagram  of  analyses  used  and  a general  sequence  in  which  they 
were  applied,  as  detailed  in  this  report,  is  shown  in  Figure  2-3.  Techniques 
used  include  the  following: 


• Two-Way Analysis of Variance

• Non-Parametric Trend Test

• Non-Parametric Run Test

• Deterministic Analyses
      Numerical Techniques
      Deterministic Prediction

• Hypothesis Testing using the following density functions:
      Runs Discrete Density Function
      Normal Density Function
      Gamma Density Function
      Exponential Density Function

• Chi-Square Density Function was used to test Goodness-of-Fit of
  random samples to the preceding parent probability density functions


Application of these and other techniques is detailed in our Phase I Final
Technical Report entitled "Statistical Analysis Methodology for the U.S. Navy
Aircrew Automated Escape Systems," dated 15 May 1976.

1.2  Rationale for Using the Statistical and Deterministic Techniques Employed

In any program of the kind addressed in this report, where a well-defined
analytical path does not exist, some trial and error is inevitable: some
techniques will be tried and shelved; other techniques, not originally
spelled out, will be tried with excellent results. Deciding which experiments
to perform on the given data is frequently a major problem. The questions
asked frequently dictate the analyses to be applied. Consider the following:

1.2.1.  Does any relationship exist among the various high performance
        aircraft ejection related fatalities and causes of the fatalities?

        This question can be answered by application of the Two-Way
        Analysis of Variance Matrix, One Observation per Cell.

1.2.2.  Does any relationship exist between two separate ejection seats
        used and causes of ejection related fatalities?

        This too is a question which can be answered by a Two-Way
        Analysis of Variance Technique.

1.2.3.  Do injury data sets contain an underlying trend?

        This can be answered by application of the non-parametric
        trend test.

1.2.4.  On a chronological basis, do ejection related fatalities occur
        randomly over time?

        This question can be answered upon application of the
        non-parametric run test.

1.2.5.  Can future ejection related fatality patterns, as well as other
        patterns, be predicted mathematically?

        Contingent upon results from hypothesis testing and non-parametric
        tests, this question can be answered by applying deterministic
        and numerical techniques.

1.2.6.  Can future ejection related injury patterns be statistically
        predicted?

        The answer to this question hinges upon proper formulation and
        testing of hypotheses about the ejection related injury data
        sets under investigation.



These  questions  are  representative,  not  exhaustive.  Results  of 
our  investigations  are  summarized  below. 

1.3  Results Obtained from Applying Statistical and Deterministic
     Analyses to Ejection Related Injury Data

In statistical analyses, answers are rarely absolute. Generally they are
given with a confidence level attached. Subject to this understanding, the
following are results obtained to date:

1.3.1(A)  A difference exists among the average number of ejection related
          fatalities caused by (a) hardware hazards, (b) aircrew judgment,
          and (c) environmental conditions, when ejections are considered
          on an aircraft-by-aircraft basis.

1.3.1(B)  A difference exists among the average number of ejection related
          fatalities associated with ejection from the A-4, A-7, A-6, F-4,
          and F-8 U.S. Navy high performance aircraft.

1.3.2(A)  No difference exists among the average number of ejection related
          fatalities caused by (a) hardware hazards, (b) aircrew judgment,
          and (c) environmental conditions, when ejections are considered
          from ESCAPAC or Martin-Baker ejection seats.

1.3.2(B)  No difference exists among the average number of ejection related
          fatalities upon ejection with either the ESCAPAC or Martin-Baker
          ejection seats.

1.3.3.  On a chronological basis, the A-4, A-6, A-7, F-4, and F-8 ejection
        related injury data individually contain underlying trends.

1.3.4.  On a chronological basis, fatalities occur randomly upon ejection
        from the A-4, A-7, F-4, and F-8 aircraft.

1.3.5.  On a chronological basis, fatalities do not occur randomly upon
        ejection from the A-6 aircraft.

1.3.6.  Based on the indicated non-random behavior in the A-6 data, future
        ejection related fatality patterns have been predicted.

1.3.7.  If a parent probability density function can be derived, future
        ejection related injury patterns can be predicted. These injury
        patterns have been predicted in this report for the A-6 aircraft
        (generalized gamma probability density function); the A-4 aircraft
        (exponential density function); and seven aircraft (A-4, A-5, A-6,
        A-7, F-4, F-8, F-9) taken all together as a random sample of size
        977 ejections (exponential density function).

1.3.8.  For the A-6 aircraft, it was discovered that the least number of
        ejection related fatalities occurred when the aircraft was in the
        velocity range of 200 to 300 knots.

1.3.9.  A comparison of the A-6 ejection related injury pattern and the
        injury pattern for seven aircraft (A-4, A-5, A-6, A-7, F-4, F-8,
        F-9) reveals that on a percentage basis, the A-6 injuries exceed
        those injuries from the seven aircraft studied, in every injury
        category except minimal injuries.

1.3.10. The seven aircraft mentioned above have a fatality rate, over the
        years studied (1969 through 1975), of 14.7 percent of all
        ejections. The A-6 fatality rate is 19.6 percent of all A-6
        ejections. Thus, the A-6 percentage fatality rate is 33.34 percent
        higher than the fatality rate for the seven aircraft.
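
The comparison in item 1.3.10 can be checked directly from the two rounded rates quoted there. The following is only a worked sketch using those rounded figures:

```latex
\[
\frac{19.6 - 14.7}{14.7} \;=\; \frac{4.9}{14.7} \;\approx\; 0.333
\]
```

That is, roughly 33.3 percent with the rounded rates, in line with the 33.34 percent quoted; the small difference presumably reflects unrounded inputs, which are not given in this section.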


1.4  Suggestions for Future Investigations

The statistical investigations reported herein, while interesting and
informative, should be viewed as beginning investigations only. They have
highlighted a real need for in-depth study of various problem areas. The
areas with an immediate need for study are summarized below:

1.4.1.  An  in-depth  investigation  of  all  A-6  ejections. 

1.4.2.  Determine  the  underlying  injury  trends  in  the  A-4,  A-6, 
A-7,  F-4,  and  F-8  ejection  related  injury  data. 

1.4.3.  Reconcile  any  apparently  inconsistent  results  detected  by 
the  analysis  of  variance  technique. 

1.4.4.  After  sufficient  analytical  results  have  been  obtained 
from  both  preliminary  and  in-depth  investigations,  the 
question  of  why  particular  ejection  injury  patterns  exist 
needs  to  be  addressed. 

1.4.5.  Statistical analysis techniques should continue to be
        applied to additional ejection data.

2.0  Introduction


This report contains detailed descriptions of results obtained upon applying
statistical analyses to actual Medical Officer's Reports (MORs) ejection
injury data. These data were developed from a case-by-case study of each
aircrew ejection from U.S. Navy high performance aircraft over the time period
1 January 1969 through 31 December 1975. The statistical analyses applied were
developed in CACI's report, "Statistical Analysis Methodology for the U.S.
Navy Aircrew Automated Escape Systems," dated 15 May 1976.

To get an overview of NAVAIR resource availability, consider Figure 2-1.
This figure delineates resource availability into four major groups:
(1) System Availability, (2) Ejection Rationale, (3) Search and Rescue
History, and (4) Medical Profile. Each of these, as the figure shows, is
further subdivided into a series of binary decision branches. Stability of
this resource model is realized whenever resource input, here interpreted as
new crewmen, new carriers, new aircraft, and new SAR equipment and personnel,
equals resource output, interpreted here as obsolescence, outmoded techniques,
injured crewmen, depreciation of equipment, and attrition of personnel.

To assist in developing an in-depth understanding of the medical profile
portion of the entire system, a statistical analysis flow diagram, depicted in
Figure 2-2, was derived. Not all statistical techniques shown in that figure
were used in this report. A streamlined version of the techniques used in
this report is illustrated in Figure 2-3.

The first technique applied to the data was the Two-Way Analysis of Variance,
one observation per cell. This statistical analysis technique was applied
specifically to fatalities incurred in connection with aircrew ejection
from U.S. Navy high performance aircraft. Results are detailed in the next
section.

FIGURE 2-3.  STATISTICAL AND DETERMINISTIC ANALYSES FOR SINGLE ATTRIBUTE DATA STREAMS

Another useful technique applied to the data is that of non-parametric
tests. The two tests employed here are the trend test and the run test. The
trend test was applied to all A-4, A-5, A-6, A-7, F-4, F-8, and F-9 injury
data. The run test was applied to all ejection related fatality data from
A-4, A-6, A-7, F-4, and F-8 aircraft. Results from the run test strongly
suggest that A-6 fatalities do not occur randomly. Accordingly, a separate
section is devoted exclusively to an analysis of A-6 fatalities.

One desirable end product of analysis is the ability to predict future
patterns. Although some hypotheses were formulated and tested in the analyses
already mentioned, it was believed appropriate to formulate and test
additional hypotheses. These relate to determining the parent probability
density functions from which the various ejection injury samples were
extracted. If the parent density function can be derived, future injury
patterns can be predicted. Three such parent probability density functions
were derived: (1) for the A-4 injuries, (2) for the A-6 injuries, and (3) for
all ejection injuries from seven high performance aircraft taken collectively.
All three parent density functions were gamma probability density functions.
The A-4 injury pattern and the collective aircraft injury pattern were samples
extracted from the exponential form of the gamma density function.
Goodness-of-fit was tested with the Chi-square statistic.


This  report  is  concluded  by  a summary  of  all  results  obtained  to  date. 



3.0  Two-Way Analysis of Variance Applied to Fatalities Experienced
     Upon Ejection from Selected U.S. Navy High Performance Aircraft


Analysis of variance is a statistical technique which can be used to make
inferences about a set of observations, previously classified according to two
criteria, displayed in a rectangular array. For example, it can be used to
analyze a rectangular array consisting of averages of ejection related
fatalities caused by (1) hardware, (2) aircrew, and (3) environment versus the
ejections which occur from the following aircraft: (1) A-4, (2) A-7, (3) A-6,
(4) F-4, and (5) the F-8. Examples of the three primary causes of fatalities
are the following:


• Hardware:
      (a)  Struck by personal equipment
      (b)  Parachute dragging

• Aircrew:
      (a)  Attempted ejection outside of safety envelope
      (b)  Misuse of survival equipment

• Environment:
      (a)  Windblast
      (b)  Initial impact with the terrain


A separate Two-Way Analysis of Variance was performed when the following
ejection systems were considered: (1) ESCAPAC and (2) Martin-Baker. The three
primary causes of fatalities (hardware, aircrew, and environment) remain
unchanged. Each of the above analyses will be discussed separately.


Raw input data to the analyses, taken from the Medical Officer's Reports
(MORs), are shown in Table 3-1. Data from Table 3-1 were categorized and are
summarized in Table 3-2. These were raw input to a Two-Way Analysis of
Variance routine, one observation per cell. Data were then normalized to 100
ejections as shown in Table 3-3, and these numbers were the refined input to
the Analysis of Variance routine.
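
The normalization step described above can be sketched as follows. This is a minimal illustration, not the report's routine: the counts are read from Table 3-2, the F-8 aircrew count of 7 is inferred from that table's row and column totals, and the names `FATALITIES`, `EJECTIONS`, and `per_100_ejections` are illustrative only.

```python
# Fatality counts by aircraft and cause, as read from Table 3-2.
# The F-8 aircrew value (7) is inferred from the row and column totals.
FATALITIES = {
    "A-4": {"Hardware": 10, "Aircrew": 15, "Environment": 12},
    "A-7": {"Hardware": 6, "Aircrew": 9, "Environment": 3},
    "A-6": {"Hardware": 5, "Aircrew": 11, "Environment": 5},
    "F-4": {"Hardware": 12, "Aircrew": 19, "Environment": 12},
    "F-8": {"Hardware": 2, "Aircrew": 7, "Environment": 2},
}

# Total ejections per aircraft (bottom row of Table 3-2).
EJECTIONS = {"A-4": 241, "A-7": 163, "A-6": 107, "F-4": 291, "F-8": 102}


def per_100_ejections(fatalities, ejections):
    """Scale every cell to fatalities per 100 ejections of that aircraft."""
    return {
        aircraft: {
            cause: 100.0 * count / ejections[aircraft]
            for cause, count in causes.items()
        }
        for aircraft, causes in fatalities.items()
    }


rates = per_100_ejections(FATALITIES, EJECTIONS)
for aircraft, causes in rates.items():
    print(aircraft, {cause: round(rate, 2) for cause, rate in causes.items()})
```

Putting every aircraft on a common denominator keeps the analysis of variance from being dominated by the aircraft with the most ejections.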

Equations  used  in  the  computations  are  summarized  below: 
r = number of rows in Table 3-3
c = number of columns in Table 3-3

Environment;  H - Hardware


TABLE 3-2.  FATALITY HISTORY UPON EJECTION FROM U.S. NAVY
            HIGH PERFORMANCE AIRCRAFT (Actual Data)

               ESCAPAC EJECTION SEAT    MARTIN-BAKER EJECTION SEAT    GRAND
CAUSE          A-4    A-7    TOTAL      A-6    F-4    F-8    TOTAL    TOTAL

Hardware        10      6      16         5     12      2      19       35
Aircrew         15      9      24        11     19      7      37       61
Environment     12      3      15         5     12      2      19       34

TOTAL           37     18      55        21     43     11      75      130

EJECTIONS      241    163     404       107    291    102     500      904

TABLE  3-3.  TWO-WAY  ANALYSIS  OF  VARIANCE  (One  Observation  per  Cell) 
AIRCRAFT  CAUSES  VERSUS  HARDWARE,  AIRCREW, 

AND  ENVIRONMENT  CAUSES  OF  FATALITIES 
(Per  100  Ejections) 


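\[
R_i = \sum_{j=1}^{c} x_{ij}, \qquad
C_j = \sum_{i=1}^{r} x_{ij}, \qquad
G = \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}
\tag{3.1}
\]

where $x_{ij}$ = number in the $ij$th cell in the rectangular array (Table 3-3, for example),

\[
SSR = \frac{1}{c} \sum_{i=1}^{r} R_i^2 - \frac{G^2}{rc}
\tag{3.2}
\]

\[
SSC = \frac{1}{r} \sum_{j=1}^{c} C_j^2 - \frac{G^2}{rc}
\tag{3.3}
\]

\[
SSE = \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}^2
      - \frac{1}{c} \sum_{i=1}^{r} R_i^2
      - \frac{1}{r} \sum_{j=1}^{c} C_j^2
      + \frac{G^2}{rc}
\tag{3.4}
\]

\[
SST = \sum_{i=1}^{r} \sum_{j=1}^{c} x_{ij}^2 - \frac{G^2}{rc}
\tag{3.5}
\]
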


Numerical  values  from  Table  3-3  were  inserted  into  equations  (3.1)  - (3.5)  and 
the  output  is  summarized  in  Table  3-4. 
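
The computation in equations (3.1) - (3.5) can be sketched in a few lines. The function name `two_way_anova` is illustrative, and the input matrix below is an assumption: the body of Table 3-3 is not reproduced in this text, so the cells are taken to be the Table 3-2 counts per 100 ejections rounded to whole numbers. Under that assumption the sketch gives the sums of squares reported in Table 3-4.

```python
def two_way_anova(table):
    """Two-way ANOVA, one observation per cell, per equations (3.1)-(3.5)."""
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]                        # R_i
    col_totals = [sum(row[j] for row in table) for j in range(c)]   # C_j
    grand = sum(row_totals)                                         # G
    correction = grand ** 2 / (r * c)                               # G^2 / rc
    ssr = sum(R * R for R in row_totals) / c - correction           # (3.2)
    ssc = sum(C * C for C in col_totals) / r - correction           # (3.3)
    sst = sum(x * x for row in table for x in row) - correction     # (3.5)
    sse = sst - ssr - ssc                                           # (3.4), rearranged
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSR": ssr, "SSC": ssc, "SSE": sse, "SST": sst,
        "F_rows": (ssr / (r - 1)) / mse,
        "F_cols": (ssc / (c - 1)) / mse,
    }


# ASSUMED Table 3-3 input: fatalities per 100 ejections, rounded to integers.
# Columns are A-4, A-7, A-6, F-4, F-8.
table_3_3 = [
    [4, 4, 5, 4, 2],    # Hardware
    [6, 6, 10, 7, 7],   # Aircrew
    [5, 2, 5, 4, 2],    # Environment
]

result = two_way_anova(table_3_3)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The residual sum of squares is obtained as SST - SSR - SSC, which is algebraically the same as equation (3.4).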


The  following  null  hypotheses  were  formulated  using  a common  denominator 
of  100  ejections  from  each  aircraft: 


$H_{0,r}$:  There is no difference in the average number of fatalities
            caused by hardware hazards, aircrew actions, and environmental
            conditions.

$H_{0,c}$:  There is no difference in the average number of fatalities
            caused by ejection from the A-4, A-7, A-6, F-4, and F-8
            aircraft.

The null hypotheses for the rows, $H_{0,r}$, and the columns, $H_{0,c}$, were
tested at the α = 0.05 level of significance. Results follow in Table 3-4.


TABLE 3-4.  TWO-WAY ANALYSIS OF VARIANCE: AIRCRAFT EJECTION FATALITIES
            CAUSED BY HARDWARE, AIRCREW, AND ENVIRONMENTAL
            CONDITIONS UPON EJECTION FROM VARIOUS U.S.
            NAVY HIGH PERFORMANCE AIRCRAFT

                   Degrees of           Sum of          Mean Square                   F-Statistic
                   Freedom              Squares                                       (Computed)

Between Rows       (r-1) = 2            SSR = 40.93     SSR/(r-1) = 20.47             MSR/MSE = 19.50
Between Columns    (c-1) = 4            SSC = 16.40     SSC/(c-1) = 4.1               MSC/MSE = 3.90
Residual           (r-1)(c-1) = 8       SSE = 8.4       SSE/[(r-1)(c-1)] = 1.05
TOTAL              (rc-1) = 14          SST = 65.73

           Level of          Critical                    Computed                    Inference
           Significance      F-Statistic                 F-Statistic

Rows       α = 0.05          F_{0,r}(2,8) = 4.46         F_{0,r}(2,8) = 19.50        Reject H_{0,r}
Columns    α = 0.05          F_{0,c}(4,8) = 3.84         F_{0,c}(4,8) = 3.90         Reject H_{0,c}


From these results, the conclusion is reached that there is a significant
difference in the average number of fatalities caused by hardware hazards,
aircrew actions, and environmental conditions. Moreover, a significant
difference exists in the average number of fatalities which occur upon
ejection from the various aircraft studied.

A completely analogous analysis was performed for comparison of fatalities
caused by hardware hazards, aircrew actions, and environmental conditions
whenever ejections were initiated from two ejection seat systems: (1) ESCAPAC,
and (2) the Martin-Baker ejection seat. Refined input to the analysis of
variance procedure is displayed in Table 3-5. Numerical results, obtained from
equations (3.1) - (3.5), are shown in Table 3-6.


The  following  null  hypotheses  were  formulated: 


$H_{0,r}$:  There is no difference in the average number of fatalities
            caused by hardware hazards, aircrew actions, and environmental
            conditions.

$H_{0,c}$:  There is no difference in the average number of fatalities
            caused upon ejection from either the ESCAPAC ejection seat
            or the Martin-Baker ejection seat.

The null hypotheses for the rows, $H_{0,r}$, and the columns, $H_{0,c}$, were
tested at the α = 0.05 level of significance. Results follow in Table 3-6.


TABLE 3-5.  TWO-WAY ANALYSIS OF VARIANCE (One Observation Per Cell)
            EJECTION SEAT CAUSES VERSUS HARDWARE, AIRCREW,
            AND ENVIRONMENT CAUSES OF FATALITIES PER
            500 EJECTIONS

CAUSE OF FATALITY     ESCAPAC     MARTIN-BAKER     TOTAL

Hardware                 20            19             39
Aircrew                  30            37             67
Environment              19            19             38

TOTAL:                   69            75            144


TABLE 3-6.  TWO-WAY ANALYSIS OF VARIANCE: EJECTION SEAT FATALITIES
            CAUSED BY HARDWARE, AIRCREW, AND ENVIRONMENTAL
            CONDITIONS UPON EJECTION, USING VARIOUS
            EJECTION SEATS, IN U.S. NAVY HIGH
            PERFORMANCE AIRCRAFT PER 500 EJECTIONS

                   Degrees of       Sum of        Mean Square       F-Statistic
                   Freedom          Squares                         (Computed)

Between Rows       (r-1)            (SSR)         (SSR)/(r-1)       MSR/MSE
Between Columns    (c-1)            (SSC)         (SSC)/(c-1)       MSC/MSE

           Level of          Critical           Computed          Inference
           Significance      F-Statistic        F-Statistic

Rows       α = 0.05          F_{0,r}(2,2)       14.26             Accept H_{0,r}
Columns    α = 0.05          F_{0,c}(1,2)                         Accept H_{0,c}



From these results, the conclusion is reached that there is no significant
difference in the average number of fatalities caused by hardware hazards,
aircrew actions, and environmental conditions. No significant difference was
discovered when a comparison was made of the two ejection system types.
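
As a worked check, the Table 3-5 entries can be substituted into equations (3.2) - (3.5) with $r = 3$, $c = 2$, and $G = 144$. This is only a sketch of the arithmetic; the critical values quoted afterwards are standard tabulated 5 percent points of the F distribution, not values taken from this report.

```latex
\begin{align*}
\frac{G^2}{rc} &= \frac{144^2}{6} = 3456 \\
SSR &= \frac{39^2 + 67^2 + 38^2}{2} - 3456 = 3727 - 3456 = 271 \\
SSC &= \frac{69^2 + 75^2}{3} - 3456 = 3462 - 3456 = 6 \\
SST &= \left(20^2 + 19^2 + 30^2 + 37^2 + 19^2 + 19^2\right) - 3456 = 3752 - 3456 = 296 \\
SSE &= SST - SSR - SSC = 296 - 271 - 6 = 19 \\
F_{\text{rows}} &= \frac{271/2}{19/2} = \frac{135.5}{9.5} \approx 14.26, \qquad
F_{\text{columns}} = \frac{6/1}{19/2} = \frac{6}{9.5} \approx 0.63
\end{align*}
```

The row statistic agrees with the 14.26 shown in Table 3-6. Both statistics fall below the tabulated critical values $F_{0.05}(2,2) = 19.00$ and $F_{0.05}(1,2) = 18.51$, which is consistent with accepting both null hypotheses.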

An apparent discrepancy exists when comparing the results of the above
Analysis of Variance tests. In one case a significant difference exists, and
in another practically analogous case, there is no significant difference in
injury patterns. Further investigation is required to account for this
phenomenon. It may be that the data tabulation, the data classification, or
the data interpretation is in error. This might explain the inconsistency.
In any event, under the assumptions cited at the beginning of this section,
there seems to be, in certain cases, some dependency among the variables
analyzed.

4.0  Non-Parametric Statistical Tests


Two non-parametric statistical tests of particular importance in the present
investigation are: (1) the one sample trend test, and (2) the one sample
run test. The trend test was applied to all A-4, A-5, A-6, A-7, F-4, F-8, and
F-9 ejection injury data acquired over the time period 1 January 1969 through
31 December 1975. Non-parametric run tests were applied to ejection related
fatality data, including "lost and unknown," for the A-4, A-6, A-7, F-4, and
F-8 aircraft. It was discovered that on a chronological time basis, fatalities
do not occur randomly upon ejection from A-6 aircraft. Consequently, a
separate section in this report has been devoted to an analysis of A-6
ejection related fatalities.


4.1  Non-Parametric  One  Sample  Trend  Analysis  of 
A-4  Ejection  Related  Injury  Data 


A trend analysis is a non-parametric statistical test used to discover
whether the data sample under investigation contains an underlying trend.
The procedure is straightforward: select the first data point in the sample
and compare its magnitude with the magnitude of each successive data point.
Count and record the number of times its magnitude exceeds the magnitude of a
later data point. Continue in this manner with each remaining data point,
counting and recording the number of times $x_i > x_j$ for $i < j$. Tabulate
the results, then add the number of "trends" so obtained. Finally, apply
normal density function theory to infer a conclusion about an underlying
trend in the data sample.
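
The counting procedure just described can be sketched as below. The function name `reverse_arrangements` is illustrative, and the short input is simply the last eleven injury codes of the A-4 sample tabulated in Table 4-1, chosen to keep the example small.

```python
def reverse_arrangements(data):
    """For each position i, count the later points j > i with data[i] > data[j]."""
    n = len(data)
    return [
        sum(1 for j in range(i + 1, n) if data[i] > data[j])
        for i in range(n)
    ]


# Last eleven entries of the A-4 injury codes in Table 4-1.
tail = [2, 2, 5, 1, 2, 5, 1, 1, 2, 4, 4]

counts = reverse_arrangements(tail)
total = sum(counts)
print(counts)   # per-point counts
print(total)    # total number of reverse arrangements in this short sample
```

For this tail the per-point counts are 3, 3, 7, 0, 2, 5, 0, 0, 0, 0, 0, which match the final eleven entries of Table 4-2. The double loop is quadratic in the sample size, which is not a concern for a sample of a few hundred points.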


To apply this to a specific example, Table 4-1 was constructed from
actual MORs data. To quantify MORs data, the following correspondences were
established:

      MORs Data        Data in Table 4-1

      G                1
      F                2
      B                3
      A                4
      L & U            5



TABLE  4-1.  NON-PARAMETRIC  ONE  SAMPLE  TREND  ANALYSIS  OF 
A-4  EJECTION  RELATED  INJURY  DATA 


2,  1,  2,  1,  1,  1,  1,  2,  3,  5,  1,  1,  1,  4,  2,  2,  2,  2,  2,  2,  2,  1,  2,

2,  2,  3,  1,  2,  1,  4,  1,  4,  1,  2,  1,  3,  2,  3,  4,  2,  1,  1,  2,  2,  4,  3,

1,  1,  1,  4,  1,  2,  1,  1,  3,  1,  2,  1,  1,  1,  2,  4,  2,  1,  3,  3,  4,  1,  2,

2,  4,  2,  2,  1,  5,  1,  2,  2,  1,  2,  3,  4,  2,  3,  1,  2,  1,  3,  1,  2,  2,  3,

2,  1,  1,  4,  1,  2,  2,  2,  1,  2,  1,  3,  1,  1,  1,  1,  3,  1,  3,  1,  1,  3,  3,

1,  1,  1,  3,  2,  2,  2,  1,  3,  3,  1,  2,  3,  1,  1,  4,  1,  2,  4,  1,  1,  2,  3,

4,  5,  1,  1,  1,  1,  2,  3,  4,  2,  1,  4,  3,  2,  2,  3,  3,  1,  1,  1,  1,  1,  3,

2,  1,  2,  1,  3,  3,  1,  1,  4,  4,  4,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  2,  1,  1,  3,  1,  5,  1,  2,  1,  1,  1,  3,  1,  4,  4,  2,  5,  1,  1,  1,  4,  1,

1,  1,  2,  3,  4,  1,  4,  1,  2,  4,  1,  5,  1,  4,  3,  1,  4,  1,  3,  1,  3,  1,  1,

2,  2,  5,  1,  2,  5,  1,  1,  2,  4,  4.


Now formulate the null hypothesis:

$H_0$:  The A-4 ejection injury data do not contain an underlying trend.

To test this hypothesis at the α = 0.05 level of significance, first compute
$N_1$ and $N_2$, where

\[
N_1 = -1.96\,\sigma_y + \mu_y, \qquad
N_2 = +1.96\,\sigma_y + \mu_y
\tag{4.1}
\]

and

\[
\mu_y = \frac{N(N-1)}{4}, \qquad
\sigma_y^2 = \frac{2N^3 + 3N^2 - 5N}{72}
\tag{4.2}
\]

N = number of data points under analysis, in this case N = 241.

From equations (4.1) and (4.2), $N_1$ = 13,590.5, $N_2$ = 15,329.5, and
from Table 4-2, $N_c$ = 10,066. Here, $N_c$ is the number of reverse
arrangements in the data sample. The acceptance region is $N_1 < N_c < N_2$.
Since $N_c < N_1$, the null hypothesis is rejected, and we conclude with
95 percent confidence that the A-4 ejection related injury data do contain an
underlying trend.
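
The acceptance limits can be evaluated with a short sketch of equations (4.1) and (4.2). The function names are illustrative. Evaluating equation (4.2) as printed for N = 241 gives limits near 13,234 and 15,686, somewhat wider than the 13,590.5 and 15,329.5 quoted above; the reason for the difference is not clear from the text, and the observed count of 10,066 falls below the lower limit in either case.

```python
import math


def trend_test_bounds(n, z=1.96):
    """Acceptance limits for the reverse-arrangement count, per (4.1)-(4.2)."""
    mean = n * (n - 1) / 4                          # mu_y
    variance = (2 * n**3 + 3 * n**2 - 5 * n) / 72   # sigma_y squared
    half_width = z * math.sqrt(variance)
    return mean - half_width, mean + half_width


def has_trend(n_c, n, z=1.96):
    """True when the count n_c falls outside the acceptance region."""
    lower, upper = trend_test_bounds(n, z)
    return not (lower < n_c < upper)


lower, upper = trend_test_bounds(241)
print(round(lower, 1), round(upper, 1))
print(has_trend(10066, 241))
```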

A flow diagram illustrating the procedure for implementing a non-parametric
trend analysis on a digital computer is displayed in Figure 4-1.


TABLE  4-2.  TRENDS  OBSERVED  IN  A NON-PARAMETRIC  ONE  SAMPLE  TREND 
ANALYSIS  OF  A-4  EJECTION  RELATED  INJURY  DATA 


109,  0,  108,  0,  0,  0,  0,  104,  160,  224,  0,  0,  0,  193,  101,  101,  101,  101,  101,  101,  101,  0,  100,

100,  100,  147,  0,  99,  0,  174,  0,  173,  0,  96,  0,  140,  95,  139,  170,  95,  0,  0,  93,  93,  165,  134,

0,  0,  0,  161,  0,  89,  0,  0,  127,  0,  86,  0,  0,  0,  83,  150,  83,  0,  119,  119,  146,  0,  81,

81,  143,  81,  81,  0,  160,  0,  79,  79,  0,  78,  108,  134,  78,  107,  0,  77,  0,  104,  0,  75,  75,  101,

75,  0,  0,  123,  0,  72,  72,  72,  0,  71,  0,  91,  0,  0,  0,  0,  87,  0,  86,  0,  0,  84,  84,

0,  0,  0,  81,  60,  60,  60,  0,  77,  77,  0,  58,  75,  0,  0,  89,  0,  55,  87,  0,  0,  53,  68,

83,  83,  0,  0,  0,  0,  49,  63,  77,  49,  0,  75,  61,  48,  48,  59,  59,  0,  0,  0,  0,  0,  54,

43,  0,  42,  0,  50,  50,  0,  0,  56,  56,  56,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,

0,  26,  0,  0,  32,  0,  48,  0,  22,  0,  0,  0,  26,  0,  29,  29,  18,  36,  0,  0,  0,  25,  0,

0,  0,  12,  17,  20,  0,  19,  0,  10,  17,  0,  20,  0,  15,  12,  0,  13,  0,  10,  0,  9,  0,  0,

3,  3,  7,  0,  2,  5,  0,  0,  0,  0,  0.


4.2  Non-Parametric One Sample Run Test Applied to
     A-4 Ejection Related Fatality Data


One  statistical  technique  that  can  be  used  to  test  randomness  in  a 
chronologically  stored  data  stream  is  that  of  the  one-sample  run  test.  A 
dichotomous  situation  needs  to  be  constructed  prior  to  application  of  the  test. 
A run  is  defined  as  a set  of  symbols  of  one  kind  preceded  and  followed  by 
symbols  of  another  kind  or  no  symbols  at  all. 

For the high performance aircraft cited in the MORs, there is a
minimum of six dichotomies which can be constructed from data from each
aircraft listed. The aircraft selected for this analysis was the A-4, and the
dichotomy of interest is that of fatalities versus all other injuries. As
noted in Table 4-3, a 1 represents all injuries except known fatalities,
and a 0 represents known fatalities. A lost or unknown ejection also was
set to a 1.
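
Constructing the dichotomy and counting runs can be sketched as follows. The helper names are illustrative, the input sequence is hypothetical, and treating injury code 4 (MORs category A) as the known-fatality code is an assumption based on the correspondence table in Section 4.1.

```python
from itertools import groupby


def dichotomize(codes, fatal_code=4):
    """Map injury codes to 0 (known fatality) or 1 (all other outcomes).

    ASSUMPTION: code 4 marks a known fatality; lost/unknown (5) maps to 1.
    """
    return [0 if code == fatal_code else 1 for code in codes]


def count_runs(sequence):
    """Number of maximal blocks of identical consecutive symbols."""
    return sum(1 for _ in groupby(sequence))


# Hypothetical short stream of injury codes, in chronological order.
codes = [1, 4, 2, 4, 4, 1, 5, 3]

symbols = dichotomize(codes)
n1 = symbols.count(1)      # non-fatal outcomes
n2 = symbols.count(0)      # known fatalities
u = count_runs(symbols)    # number of runs
print(symbols, n1, n2, u)
```

Here the dichotomous stream is 1, 0, 1, 0, 0, 1, 1, 1, which contains five runs.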


TABLE 4-3.  AN ANALYSIS OF A RUN TEST FOR A-4 DICHOTOMOUS
            EJECTION DATA (1 = Non-Fatal Injury;
            0 = Fatal Injury)

1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  1,  1,  1,  1,  1,  0,  1,  0,  1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  1,  1,  0,  1,

1,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  1,  0,  1,  1,

1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  0,  1,  1,  1,  1,

0,  1,  1,  1,  1,  1,  1,  1,  0,  1,  1,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  0,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,

1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0,  1,  1,  1,  1,  1,  0,  1,

1,  1,  1,  1,  0,  1,  0,  1,  1,  0,  1,  1,  1,  0,  1,  1,  0,  1,  1,  1,  1,  1,  1,

1,  1,  1,  1,  1,  1,  1,  1,  1,  0,  0.

The null hypothesis to be tested is the following:

$H_0$:  Upon ejection from A-4 aircraft, known fatalities
        occur randomly.

The procedure for testing the validity of $H_0$ is straightforward. Count
the number, $n_1$, of 1's, the number, $n_2$, of 0's, and the number of runs,
$u$. From Table 4-3, the following is observed:

\[
n_1 = 212, \qquad n_2 = 29, \qquad u = 50
\tag{4.3}
\]

From the statistical theory of runs,

\[
E(u) = \frac{2 n_1 n_2}{n_1 + n_2} + 1,
\tag{4.4}
\]

and if $u > 10$,

\[
\mathrm{Var}(u) = \frac{2 n_1 n_2 \left(2 n_1 n_2 - n_1 - n_2\right)}
                       {(n_1 + n_2)^2 \,(n_1 + n_2 - 1)}
\tag{4.5}
\]

\[
z_c = \frac{u - E(u)}{\sqrt{\mathrm{Var}(u)}}
\tag{4.6}
\]

Substitute the numbers from equation (4.3) into equations (4.4) - (4.6) to get:

\[
E(u) = 52.020, \qquad \mathrm{Var}(u) = 10.634, \qquad z_c = -0.6194
\tag{4.7}
\]

From tabulated values,

\[
z_{-0.025} = -1.96, \qquad z_{+0.025} = +1.96.
\tag{4.8}
\]

Since $z_{-0.025} < z_c < z_{+0.025}$, that is, since -1.96 < -0.6194 < +1.96,
the null hypothesis of randomness cannot be rejected at the α = 0.05 level of
significance. That is, it can be said with 95 percent confidence that, from
results of the run test, fatalities upon ejection from A-4 U.S. Navy high
performance aircraft occur randomly. An overview computer flow diagram for
computing run tests is shown in Figure 4-2.
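
The run-test arithmetic can be sketched with the counts reported in equation (4.3). The function name `run_test` and its return layout are illustrative only.

```python
import math


def run_test(n1, n2, u, z_crit=1.96):
    """Large-sample one-sample run test, following equations (4.4)-(4.6)."""
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1                                       # (4.4)
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / (n**2 * (n - 1))  # (4.5)
    z = (u - expected) / math.sqrt(variance)                             # (4.6)
    # Randomness is not rejected when z lies inside (-z_crit, +z_crit).
    return expected, variance, z, abs(z) < z_crit


# Counts reported for the A-4 data in equation (4.3).
expected, variance, z, not_rejected = run_test(212, 29, 50)
print(round(expected, 3), round(variance, 3), round(z, 3), not_rejected)
```

With these counts the sketch gives an expected number of runs near 52.02, a variance near 10.63, and a standardized statistic near -0.62, so the hypothesis of randomness is not rejected at the 5 percent level.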

This non-parametric statistical run test was also applied to ejection
related fatality data on the A-6, A-7, F-4, and F-8 aircraft. The major
conclusion inferred from these analyses is that on a chronological time basis,
ejection related fatalities associated with all aircraft mentioned above,
except the A-6, occur randomly. A special analysis of A-6 fatalities is
presented in Section 5.

FIGURE 4-2.  NON-PARAMETRIC STATISTICAL RUN TEST

5.0  An Analysis of A-6 Ejection Related Fatalities


A statistical test called the run test was applied to A-6 ejection fatality
data so that an inference could be drawn from the following null hypothesis:

$H_0$:  Upon ejection from A-6 aircraft, known fatalities occur
        randomly on a chronological time basis.

Under the run test, this hypothesis was rejected at the α = 0.05 level of
significance. Stated another way, it can be said with 95 percent confidence
that fatalities do not occur randomly upon ejection from the A-6 aircraft. An
analysis to ascertain whether A-6 ejection fatalities occur in a deterministic
pattern then was conducted. Predictions about the occurrence of future
fatalities were then made from the given pattern. Here it must be emphasized
that any prediction about future fatality patterns is predicated on the
assumption that future ejection conditions are similar to present and past
conditions.


5.1 A Deterministic Analysis of A-6 Ejection
Related Fatality Patterns

To start the analysis, a tabulation of information about A-6 fatality
occurrences was constructed. These data are shown in Table 5-1. A graph of the
total fatalities, $T_i$, versus the decimal approximation of the date, $t_i$
(in years after 1 January 1969), shows that the data fall quite naturally into
three linear clusters: from 7 March 1969 through 12 June 1970; from 27 May 1972
through 19 September 1973; and from 9 October 1974 through 20 August 1975.


An analysis was performed to fit each line segment to the given data points
such that the fit was best in the least squares sense. The following equations
were used to compute the slope and intercept, respectively, for each line
segment:

$$ m_j = \frac{\sum_{i=1}^{n_j} t_i T_i - n_j \bar{t}_j \bar{T}_j}{\sum_{i=1}^{n_j} t_i^2 - n_j \bar{t}_j^2}, \qquad i = 1, 2, \ldots, n_j; \quad j = 1, 2, 3. $$    (5.1)

$$ b_j = \bar{T}_j - m_j \bar{t}_j $$    (5.2)

Here

$$ \bar{t}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} t_i \qquad \text{and} \qquad \bar{T}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} T_i $$    (5.3)


TABLE 5-1. AN ANALYSIS OF A-6 EJECTION RELATED FATALITY DATA

Date        Day    Kind of     Kind of     t_i
                   Aircraft    Fatality
03-07-69     66    A-6A        A
03-07-69     66    A-6A        A
08-26-69    238    A-6A        A
12-26-69    360    A-6A        A
12-26-69    360    A-6A        L
02-16-70     47    A-6A        L
04-22-70    112    A-6A        A
06-12-70    163    A-6A        A
05-27-72    148    A-6E        A
08-09-72    222    KA-6D       A
10-24-72    298    A-6A        L
10-31-72    305    KA-6D       A
10-31-72    305    KA-6D       A
09-19-73    262    A-6A        A
09-19-73    262    A-6A        A
10-09-74    282    KA-6D       L           5.7726
11-20-74    324    A-6A        A           5.8877
11-20-74    324    A-6A        A           5.8877
01-13-75     13    EA-6B       L           6.0356
06-25-75    176    A-6E        L           6.4822
08-20-75    232    KA-6D       L           6.6356


Apply the above to data in Table 5-1 to get:

$$ L_1: \; T_{L_1} = 4.93346\, t_{L_1} + 0.26413, $$    (5.4)

$$ L_2: \; T_{L_2} = 3.79448\, t_{L_2} - 3.13869, $$    (5.5)

$$ L_3: \; T_{L_3} = 4.93626\, t_{L_3} - 11.69458 $$    (5.6)
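The least squares fit of equations (5.1) and (5.2) can be sketched for the third cluster, whose decimal dates survive in Table 5-1. This is a minimal illustration, not the program used for the report: the function name `fit_segment` is hypothetical, and the cumulative counts 16 through 21 assume that fatalities are numbered in chronological order.

```python
# Least squares slope and intercept for one cluster, per equations (5.1)-(5.3).
def fit_segment(t, T):
    n = len(t)
    t_bar = sum(t) / n                      # mean decimal date
    T_bar = sum(T) / n                      # mean cumulative fatality count
    num = sum(ti * Ti for ti, Ti in zip(t, T)) - n * t_bar * T_bar
    den = sum(ti * ti for ti in t) - n * t_bar ** 2
    m = num / den                           # slope, equation (5.1)
    b = T_bar - m * t_bar                   # intercept, equation (5.2)
    return m, b, t_bar, T_bar

# Third cluster: decimal dates from Table 5-1.
t3 = [5.7726, 5.8877, 5.8877, 6.0356, 6.4822, 6.6356]
# Assumption: cumulative fatality numbers 16..21 in chronological order.
T3 = [16, 17, 18, 19, 20, 21]

m3, b3, t_bar3, T_bar3 = fit_segment(t3, T3)
print(f"slope = {m3:.5f}, intercept = {b3:.5f}")
print(f"centroid = ({t_bar3:.4f}, {T_bar3:.1f})")
```

Under these assumptions the slope and intercept should agree closely with equation (5.6), and the centroid with the third pair of coordinates in (5.7).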


Next, the centroids of the line segments were computed, graphed, and a curve
drawn through them. Coordinates of the centroids are as follows:

$$ \bar{T}_{L_1} = 4.5; \qquad \bar{t}_{L_1} = 0.8586 $$

$$ \bar{T}_{L_2} = 12.0; \qquad \bar{t}_{L_2} = 3.98966 $$    (5.7)

$$ \bar{T}_{L_3} = 18.5; \qquad \bar{t}_{L_3} = 6.1169 $$


A graph of these points appeared to follow a quadratic of the form

$$ T = \sum_{k=0}^{2} A_k t^k $$    (5.8)

By using an arithmetic averaging scheme, it was easy to compute $A_k$ in
equation (5.8). The complete quadratic is:

$$ T = 2.8734476 + 1.786602\, t + 0.125562\, t^2 $$    (5.9)
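The coefficients in equation (5.9) can be checked against the three centroids. The report does not spell out its averaging scheme, so the sketch below swaps in exact interpolation through the three points (Lagrange form) as an assumption; the function name `quadratic_through` is hypothetical.

```python
# Quadratic T = A0 + A1*t + A2*t**2 passing exactly through three points.
# Assumption: exact interpolation stands in for the report's averaging scheme.
def quadratic_through(points):
    (x0, y0), (x1, y1), (x2, y2) = points
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    a2 = y0 / d0 + y1 / d1 + y2 / d2
    a1 = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    a0 = y0 * x1 * x2 / d0 + y1 * x0 * x2 / d1 + y2 * x0 * x1 / d2
    return a0, a1, a2

# Centroids (t, T) of line segments L1, L2, L3.
centroids = [(0.8586, 4.5), (3.98966, 12.0), (6.1169, 18.5)]
A0, A1, A2 = quadratic_through(centroids)

def T_of(t):
    return A0 + A1 * t + A2 * t * t

print(f"A0 = {A0:.6f}, A1 = {A1:.6f}, A2 = {A2:.6f}")
```

The interpolated coefficients should land close to those of equation (5.9), which suggests the published quadratic passes through the centroids to within rounding.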


To project into the future, it is desired to find the centroid of another line
segment, $L_4$. To accomplish this, notice that the time distance between the
derived centroids of line segments $L_1$, $L_2$, and $L_3$ appears to contract
according to a logarithmic law. Thus,

$$ \Delta t = B_1 e^{B_2 \hat{t}} $$    (5.10)

where

$$ \Delta t = t_{j+1} - t_j \qquad \text{and} \qquad \hat{t} = \frac{t_{j+1} + t_j}{2}, $$

and $B_1$, $B_2$ are constants to be determined. Data in Table 5-2 will assist
in computing $B_1$ and $B_2$.


TABLE 5-2. DATA USED FOR COMPUTING CONSTANTS B_1 AND B_2
IN AN EXPONENTIAL EQUATION

t_j         Δt = t_{j+1} - t_j     t̂ = (t_{j+1} + t_j)/2
0.8586
            3.13106                2.42413
3.98966
            2.12724                5.05328
6.1169

The data in Table 5-2 permit equation (5.10) to be written thus:

$$ \Delta t = 4.471717\, e^{-0.147023\, \hat{t}} $$

From the definitions given in the headings of Table 5-2, it is easy to derive
the following iterative scheme:

$$ D = \hat{t} - 6.1169 - 2.23586\, e^{-0.147023\, \hat{t}} $$

$D = 0$ whenever $\hat{t} = 6.92475$. Thus

$$ t_2 = (2)(\hat{t}) - t_1, $$

or

$$ t_2 = (2)(6.92475) - 6.1169 = 7.7326 $$

From equation (5.9),

$$ T(7.7326) = 24.1964 $$    (5.13)

Thus,

$$ t_2 = 7.7326, \qquad T = 24.1964 $$    (5.14)
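The root of the iterative scheme can be located numerically. The sketch below uses bisection, which is an assumption (the report does not name its root-finding method), with the constants 6.1169, 2.23586 and an exponent of -0.147023, the sign that reproduces the Δt values in Table 5-2. The function names are hypothetical.

```python
from math import exp

# D(t_hat) = t_hat - t_last - (B1/2) * exp(B2 * t_hat); D = 0 at the midpoint
# between the last centroid and the projected one.
def contraction_residual(t_hat, t_last=6.1169, half_b1=2.23586, b2=-0.147023):
    return t_hat - t_last - half_b1 * exp(b2 * t_hat)

# Plain bisection; assumes the root is bracketed by [lo, hi].
def bisect(f, lo, hi, tol=1e-9):
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

t_hat = bisect(contraction_residual, 6.0, 8.0)
t_next = 2.0 * t_hat - 6.1169      # projected centroid time, t2 = 2*t_hat - t1
print(f"t_hat = {t_hat:.5f}, projected centroid time = {t_next:.4f}")
```

The same two functions can be reused for the second scheme, equation (5.19), by passing `t_last=6.6356`, `half_b1=1.6664237`, and `b2=-0.2193389`.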


These are the extrapolated coordinates of the projected centroid for line
segment $L_4$. The slope of $L_4$ is assumed to be the mean of the slopes of
lines $L_1$, $L_2$, and $L_3$. Thus,

$$ m_4 = \frac{1}{3} \sum_{j=1}^{3} m_j $$    (5.15)

$$ m_4 = (4.93346 + 3.79448 + 4.93626)/3 = 4.554734 $$    (5.16)

From equations (5.14) and (5.16), the line segment $L_4$ is completely
determined. Thus,

$$ L_4: \; T_{L_4} = 4.554734\, t_{L_4} - 11.023536. $$    (5.17)

To determine a reasonable starting point on the line $L_4$, again assume that
the time lengths between the last points on preceding line segments and the
first points on current line segments are logarithmic in nature. That is,

$$ \Delta t = B_1 e^{B_2 \hat{t}} $$    (5.18)

Proceed in a manner entirely analogous to that developed earlier, and write an
iterative scheme as follows:

$$ D_2 = \hat{t} - 6.6356 - 1.6664237\, e^{-0.2193389\, \hat{t}} $$    (5.19)

$D_2 = 0$ whenever $\hat{t} = 6.9949 = (t_1 + t_2)/2$, where $t_1 = 6.6356$.
Thus, $t_2 = 7.3542$. Translated, this becomes the year 1976, and
(0.3542)(366) ≈ 130 days into 1976. That is, 15 May 1976, which is a predicted
start date for fatalities clustered about the line segment $L_4$. Average time
length per line segment = 1.1474 years. Time length of line segment $L_4$ =
1.1458 years, over the time interval 15 May 1976 to 1 July 1977 inclusive.


A prediction scheme, subject to the assumptions stated earlier, is
hypothesized in Table 5-3 below. These dates should be interpreted as "on or
about" dates, since bounds on the dates given for line segments $L_1$, $L_2$,
and $L_3$ have not yet been established. Moreover, the dates given fall on the
line segment $L_4$. A graphical display of these results is shown in Figure 5-1.


TABLE 5-3. PREDICTED DATES FOR A-6 EJECTION RELATED FATALITIES,
UNDER THE ASSUMPTION OF LOGARITHMIC TIME CONTRACTION

            Predicted                          Observed
Date                   Fatalities       Date          Fatalities
May 15, 1976               2
August 15, 1976            1
December 1, 1976           1
February 15, 1977          1
May 1, 1977                1
July 25, 1977              1

FIGURE 5-1. GRAPH OF TOTAL A-6 EJECTION RELATED FATALITIES VERSUS TIME


Table 5-3 was constructed under the assumption of logarithmic contraction of
not only the time distance between the line segment centroids, but also the
time distance between the end of one line segment and the beginning of the
succeeding line segment. If arithmetic averages are used instead of
logarithmic contraction, the following numbers are easily developed from
Table 5-1:

1.  Coordinates of the centroid of the line segment $L_4$:

        ($t_{L_4}$ = 8.7461, $T_{L_4}$ = 25.50)

2.  Equation of line segment $L_4$ is:

        $$ T_{L_4} = 4.554734\, t_{L_4} - 14.336159 $$    (5.20)

3.  Starting time for fatalities clustered about line segment $L_4$ is
    t = 8.1419 years after 1 January 1969, or 21 February 1977.

4.  Termination time for fatalities clustered about line segment $L_4$ is
    t = 9.3503 years after 1 January 1969, or 7 May 1978.

5.  Time length of line segment $L_4$ is 1.2084 years ≈ 1 year and 76 days.


Predicted fatalities under the assumption of arithmetic averaging are shown
in Table 5-4. A graph of the line segment $L_4$, developed under the
assumption of arithmetic averaging, also is shown in Figure 5-1.


TABLE 5-4. PREDICTED DATES FOR A-6 EJECTION RELATED FATALITIES,
UNDER THE ASSUMPTION OF ARITHMETIC AVERAGING OF TIME

Predicted Date
February 21, 1977
May 22, 1977
September 7, 1977
November 22, 1977
February 7, 1978
April 29, 1978

To summarize, it should be recalled that the first fatality prediction
algorithm was deterministically constructed under the following fundamental
assumption: the time interval between the termination of one cluster of
fatalities and the start of a succeeding cluster is governed by logarithmic
(or exponential) contraction. The second prediction algorithm was based on
the assumption that this time interval is the arithmetic average of the
preceding analogous time intervals.

Basic to the entire prediction methodology, whether the assumption is made of
logarithmic time contraction or arithmetic averaging, is the fundamental
assumption: ejection conditions in the future will be the same as those in the
past. This fundamental assumption cannot be overstressed. An analogous
situation is the injury pattern experienced by aircrew who eject from the A-6
aircraft. This problem, addressed at length in Section 6, is analyzed by a
goodness-of-fit algorithm. It will be shown that, on the basis of available
data and on the basis of invariance of ejection scenarios, future ejection
related injury patterns can be predicted with a high degree of confidence.

In retrospect, a non-parametric statistical run test was applied to a set of
A-6 fatality data to investigate temporal randomness (randomness in time). It
was discovered that A-6 ejection related fatalities do not occur randomly over
time. A high degree of confidence can be placed in that assertion. Next, two
deterministic prediction schemes were derived: (1) logarithmic time interval
contraction, and (2) arithmetic time interval averaging. Results are
summarized in Tables 5-3 and 5-4, respectively.

Because of the impact and severe nature of making such predictions concerning
fatalities, caution must be exercised in interpreting the algorithm. The basic
underlying assumption, required for a prediction to be valid, is that ejection
conditions in the future will remain the same as in the past. In this case,
for example, the ejection regime, the flight hour program, the system
functions, and the missions must all remain the same. Regardless, the
mathematics of the prediction scheme is valid, and the fact remains that
fatalities did not occur randomly in time upon ejection from A-6 aircraft;
hence they can be described quite accurately with deterministic functions.
This in itself is sufficient to call attention to A-6 ejections and to support
a detailed study of the fatalities experienced upon ejection from A-6 aircraft.

5.2 A Statistical Run Test Analysis of A-6 Ejection
Related Fatalities Versus Ejection Airspeed

One question pertinent to the A-6 fatality analysis is whether a statistical
run test on ejection airspeed would give some insight into the A-6 ejection
fatality problem. To answer this question, consider the data in Table 5-5,
which lists total A-6 ejection fatalities versus the airspeed at which the
ejection took place. The dichotomy used was that of values above and below the
median airspeed. From Table 5-5, the median airspeed is 240 knots, from which
Table 5-6 is derived. In Table 5-6, a represents an airspeed above the median
and b represents an airspeed below the median. The median airspeed is not
included in this test.

Here, a run is defined as a sequence of letters of the same kind bounded by
letters of the other kind or by no letters at all. The null hypothesis to be
tested is that the two samples, the set of a's and the set of b's, are
extracted from the same parent population density function. If the two samples
are from the same parent population, the a's and b's will ordinarily be well
mixed and the number of runs, u, will be large. If the two populations are so
widely separated that their ranges do not overlap, the number of runs will be
only 2. In general, differences between the two parent populations will tend
to reduce the number of runs, u.


TABLE 5-5. TOTAL A-6 FATALITIES VERSUS EJECTION
AIRSPEED IN KNOTS

Date          Ejection Airspeed (knots)
08-09-72
10-24-72
10-31-72
10-31-72
09-19-73
09-19-73
10-09-74
11-20-74
11-20-74
01-13-75
06-25-75
08-20-75


TABLE 5-6. INPUT DATA TO A RUN TEST OF A-6
EJECTION FATALITIES VERSUS
AIRSPEED

Now formulate the following null hypothesis:

    $H_0$: The sample of a's and the sample of b's (see Table 5-6) are two
    random samples extracted from the same parent population density function.

From Table 5-6, the following numbers are developed:

    $n_1$ = number of a's = 10
    $n_2$ = number of b's = 10
    $u$ = number of runs = 4


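To assist with a test of the null hypothesis, use the following exact
probability density function:

$$ f(u) = \begin{cases}
\dfrac{2 \binom{n_1 - 1}{k - 1} \binom{n_2 - 1}{k - 1}}{\binom{n_1 + n_2}{n_1}} ; & u = 2k \\[2ex]
\dfrac{\binom{n_1 - 1}{k} \binom{n_2 - 1}{k - 1} + \binom{n_1 - 1}{k - 1} \binom{n_2 - 1}{k}}{\binom{n_1 + n_2}{n_1}} ; & u = 2k + 1
\end{cases} $$    (5.21)

To test the null hypothesis at the α level of significance, find a positive
integer $u_0$, so that as nearly as possible:

$$ \sum_{u=0}^{u_0} f(u) = \alpha $$    (5.22)

Reject the null hypothesis $H_0$ if the observed $u \le u_0$. In this problem,
choose α = 0.05.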

Using the numbers developed from Table 5-6, equations (5.21) become:

$$ f(u) = \begin{cases}
\dfrac{2 \binom{9}{k - 1}^2}{\binom{20}{10}} ; & u = 2k \\[2ex]
\dfrac{2 \binom{9}{k} \binom{9}{k - 1}}{\binom{20}{10}} ; & u = 2k + 1
\end{cases} $$    (5.23)

Let k = 0 in equation (5.23), then f(0) = f(1) = 0. Next, let k = 1, then

    f(2) = 0.000010825
    f(3) = 0.0000974257

In an entirely analogous fashion, whenever k = 2, 3, ..., 10, values for
f(4), f(5), ..., f(20), f(21) were computed. Complete results are shown in
Table 5-7. A graph of results in Table 5-7 is shown in Figure 5-2.


TABLE 5-7. RESULTS OF COMPUTING EXACT PROBABILITIES
FROM THE RUN TEST

k         f(2k)          f(2k + 1)
0         0.000          0.000
1         0.0000108      0.0000974257
2         0.0008768      0.0035073285
3         0.0140293      0.0327350667
4         0.0763818      0.1145727338
5         0.1718591      0.1718591007
6         0.1718591      0.1145727338
7         0.0763818      0.0327350667
8         0.0140293      0.0035073285
9         0.0008768      0.0000974257
10        0.0000108      0.000
TOTALS    0.5263166      0.4736842101

Also,

$$ \sum_{u=0}^{7} f(u) = 0.0512557 $$

$$ \sum_{u=0}^{6} f(u) = 0.0185207. $$

Since

$$ \sum_{u=0}^{6} f(u) < \alpha < \sum_{u=0}^{7} f(u), $$

and since u = 4 < 6 < 7, reject the null hypothesis at the α level of
significance, and conclude that the samples in the run of Table 5-6 are random
samples extracted from different parent population density functions.
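The exact run distribution of equation (5.21) is straightforward to evaluate directly. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function names are hypothetical, and the critical value is taken as the largest $u_0$ whose cumulative probability does not exceed α, which is one reading of "as nearly as possible" that matches the comparison made in the text.

```python
from math import comb

# Exact probability of observing u runs with n1 a's and n2 b's, equation (5.21).
def run_pmf(u, n1, n2):
    if u < 2:
        return 0.0
    total = comb(n1 + n2, n1)
    k = u // 2
    if u % 2 == 0:                                   # u = 2k
        return 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1) / total
    return (comb(n1 - 1, k) * comb(n2 - 1, k - 1)    # u = 2k + 1
            + comb(n1 - 1, k - 1) * comb(n2 - 1, k)) / total

# Largest u0 with P(U <= u0) <= alpha (assumed decision convention).
def critical_runs(n1, n2, alpha):
    cumulative, u0 = 0.0, 0
    for u in range(2, n1 + n2 + 1):
        if cumulative + run_pmf(u, n1, n2) > alpha:
            break
        cumulative += run_pmf(u, n1, n2)
        u0 = u
    return u0, cumulative

n1, n2, observed_u = 10, 10, 4
u0, tail = critical_runs(n1, n2, 0.05)
reject = observed_u <= u0
print(f"u0 = {u0}, P(U <= u0) = {tail:.7f}, reject H0: {reject}")
```

For n1 = n2 = 10 the probabilities sum to one over u = 2, ..., 20, and the observed u = 4 falls in the rejection region, in line with Table 5-7.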

The null hypothesis could have been stated as follows: $H_0$: on a
chronological time basis, ejection airspeed resulting in a fatality is a
random occurrence. As demonstrated in the preceding analysis, this null
hypothesis of randomness must be rejected.

It is of interest to compare these results with the "normal approximation"
results. The mean and variance of the exact probability density function
defined by equation (5.21) are:

$$ E(u) = \frac{2 n_1 n_2}{n_1 + n_2} + 1 $$    (5.24)

$$ \mathrm{Var}(u) = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} $$    (5.25)

Since $n_1 = n_2 = 10$, it is easy to compute E(u) and Var(u) from equations
(5.24) and (5.25). Thus,

$$ E(u) = \frac{(2)(10)(10)}{(10 + 10)} + 1 = 11 $$

$$ \mathrm{Var}(u) = \frac{(2)(10)(10)[(2)(10)(10) - 10 - 10]}{(10 + 10)^2 (10 + 10 - 1)} $$

$$ \mathrm{Var}(u) = 4.7368; \qquad \sqrt{\mathrm{Var}(u)} = 2.1764 $$

Now compute the z-statistic, where

$$ z_0 = \frac{u - E(u)}{\sqrt{\mathrm{Var}(u)}} $$    (5.26)

Thus,

$$ z_0 = \frac{4 - 11}{2.17642875} = -3.2163 $$

The acceptance region, for normal z and level of significance α = 0.05, is

$$ -1.96 < z < +1.96. $$

Since $z_0 < -z_{0.025}$, that is, since -3.2163 < -1.96, the null hypothesis
that the sample of a's and the sample of b's in Table 5-6 are random samples
extracted from the same parent population density function is rejected. Stated
another way, since $z_0 < -z_{0.025}$, the null hypothesis of randomness, that
is, the random recurrence over time of the ejection airspeed at which a
fatality occurred, is rejected at the α = 0.05 level of significance.
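The normal approximation amounts to three short formulas. The sketch below evaluates equations (5.24) through (5.26) for the Table 5-6 counts; the function name `run_test_z` is hypothetical and the two-sided critical value 1.96 is the tabulated figure quoted in the text.

```python
from math import sqrt

# Normal approximation to the run test: mean, variance and z-statistic
# from equations (5.24), (5.25) and (5.26).
def run_test_z(u, n1, n2):
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
           / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (u - mean) / sqrt(var), mean, var

z0, mean_u, var_u = run_test_z(u=4, n1=10, n2=10)
reject_normal = abs(z0) > 1.96          # two-sided test at alpha = 0.05
print(f"E(u) = {mean_u}, Var(u) = {var_u:.4f}, z = {z0:.4f}, reject: {reject_normal}")
```

Both the exact distribution and the normal approximation lead to the same decision for these data.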


5.3 Relationship Between A-6 Ejection Velocity
and Ejection Related Fatalities

A set of tabulated values of ejection airspeed ranges, varying from 0 through
600 knots, was developed from the MORs. The number of ejections and the number
of fatalities within each airspeed range are given in Table 5-8. Percent
fatalities, defined as one hundred times the ratio of the number of fatalities
to the number of ejections, also is shown. A graph of these data is shown in
Figure 5-3. From the graph, it appears that a relatively safe ejection region
is from 200 to 400 knots. Numbers on the graph are read as follows: (8/2) = 8
ejections, 2 of which resulted in fatalities.

The point indicated on the graph (Figure 5-3) represents all A-6 ejections and
all A-6 ejection related fatalities.


TABLE 5-8. A-6 EJECTION RELATED FATALITIES AS A FUNCTION OF
TOTAL EJECTIONS AND EJECTION AIRSPEED IN KNOTS

Ejection Airspeed Range (Knots)                               Percent
(v_1 ≤ v < v_2)                    Ejections    Fatalities    Fatalities
0-100                                   8            2          25.00
100-200                                38            9          23.68
200-300                                34            1           2.94
300-400                                13            2          15.38
400-500                                11            6          54.55
500-600                                 3            1          33.34
TOTALS                                107           21          19.63


FIGURE 5-3. PERCENT A-6 FATALITIES VERSUS EJECTION VELOCITY
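The percent-fatality column of Table 5-8 follows directly from the counts. A minimal sketch, with hypothetical variable names, recomputes each rate and the overall figure:

```python
# (airspeed range in knots, ejections, fatalities) from Table 5-8.
bins = [
    ("0-100", 8, 2),
    ("100-200", 38, 9),
    ("200-300", 34, 1),
    ("300-400", 13, 2),
    ("400-500", 11, 6),
    ("500-600", 3, 1),
]

def percent_fatal(ejections, fatalities):
    # One hundred times the ratio of fatalities to ejections.
    return 100.0 * fatalities / ejections

rates = {label: percent_fatal(e, f) for label, e, f in bins}
total_ejections = sum(e for _, e, _ in bins)
total_fatalities = sum(f for _, _, f in bins)
overall = percent_fatal(total_ejections, total_fatalities)

for label, rate in rates.items():
    print(f"{label:>8} knots: {rate:6.2f} percent")
print(f"overall: {overall:.2f} percent")
```

The small counts in the outer ranges (3 ejections at 500-600 knots, 8 at 0-100 knots) mean those percentages move sharply with a single additional event.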


6.0  Hypothesis  Testing 


Hypothesis testing is a statistical analysis procedure which enables one to
do the following: (1) formulate a hypothesis, (2) test the hypothesis, and
(3) derive a measure of confidence that can be attached to the results. In
restricted circumstances, hypothesis testing can be used as a predictor
mechanism.

As defined above, hypothesis testing will be applied to the following: (1) the
A-4 ejection injury data, (2) the A-6 ejection injury data, and (3) ejection
related injury data derived from most U.S. Navy high performance aircraft over
the time period 1 January 1969 through 31 December 1975. The hypothesis to be
tested is that the ejection injury data are a random sample extracted from a
given parent probability density function. It will be shown that each data set
mentioned above is extracted from a different gamma parent density function.
The A-4 data, as well as the injury data from seven U.S. Navy aircraft, are
data sets extracted from similar gamma parent density functions. The A-6 data
set, though extracted from a gamma parent density function, is quite different
from the other two data sets. Goodness-of-fit is tested with the Chi-square
statistic.


6.1  An  Analysis  of  the  A-4  Ejection  Related  Injury  Data 


To apply hypothesis testing to actual data, consider the A-4 injury data over
the years 1969-1975 as found in the MORs. The following null hypothesis is
formulated:

    $H_0$: The A-4 injury data is a random sample of size 241 extracted from
    a gamma parent density function.

Injury data are summarized in Table 6-1.

The gamma density function is

$$ f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha - 1} e^{-x/\beta}; \qquad x > 0 $$    (6.1)

Here, let α = 1, since a graph of the data (see Figure 6-1) appears
exponential in nature. Thus, equation (6.1) becomes:

$$ f(x) = a e^{-ax}; \qquad a = \frac{1}{\beta}; \qquad x \ge 0 $$    (6.2)


TABLE 6-1. INJURY PROFILE FOR A-4 EJECTIONS

Injury Code           Number of Injuries    Percentage of
                                            Total Injuries
No/minimal                  109                  45.2
Minor                        60                  24.9
Major                        35                  14.5
Fatal                        29                  12.0
Lost and Unknown              8                   3.3

The  numerical  procedure  for  computing  the  constant  a in  equation 
(6.2)  is  summarized  in  Table  6-2.  A simple  arithmetic  averaging  technique 
was  used  to  get  a = 0.46208. 


To compute expected frequency, equation (6.2) is integrated between definite
limits. Thus,

$$ F_i = P(A < x < B) = \int_A^B a e^{-ax}\, dx $$

$$ F_i = P(A < x < B) = e^{-aA} - e^{-aB} $$    (6.3)

The Chi-square test with 4 degrees of freedom can now be computed.

$$ \chi^2_{c/\mathrm{d.f.}} = \sum_{i=1}^{5} \frac{(f_i - F_i)^2}{F_i} $$    (6.4)

where $f_i$ is the observed frequency of injury occurrence, and $F_i$ is the
expected frequency.

From column 8 in Table 6-2, $\chi^2_{c/\mathrm{d.f.}=4} = 0.0385497$, or
translated to the 241 ejection injury base,

$$ \chi^2_{c/\mathrm{d.f.}=4} = 9.2904 $$

The tabulated value at the significance level α = 0.05 is

$$ \chi^2_{T/\mathrm{d.f.}=4;\ \alpha=0.05} = 9.488 $$

Since $\chi^2_c < \chi^2_T$, the null hypothesis is accepted and it can be
stated with 95 percent confidence that this particular data sample of A-4
injuries was extracted from a parent density function which is gamma
distributed.


This is one example of the use of hypothesis testing. A great many additional
hypotheses, similar to the above, can be formulated from the MORs data.
Consider next the A-6 ejection injury data.


6.2 An Analysis of the A-6 Ejection Related Injury Data

A graph of the A-6 ejection related injury data is shown in Figure 6-2.
Inspection of Figure 6-2 leads to the following null hypothesis:

    $H_0$: The A-6 ejection related injury data is a random sample extracted
    from a gamma parent probability density function.

To test validity of this hypothesis, first compute α and β in the gamma parent
probability density function:

$$ f(x) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha - 1} e^{-x/\beta}; \qquad x \ge 0, $$    (6.5)

then compute theoretical frequency using equation (6.5), finally compute the
Chi-square statistic, and compare the computed value with the value given at
the level of significance, α = 0.05 in this case. Numbers used to compute α
and β in equation (6.5) are developed from Table 6-3.


FIGURE 6-2. HISTOGRAM OF A-6 EJECTION RELATED INJURY DATA


TABLE 6-3. NUMBERS DEVELOPED FROM A-6 INJURY DATA

x        f*(x)       ln x          ln f*(x)
0.5      0.1869     -0.693147     -1.677182
1.5      0.3552      0.405465     -1.035074
2.5      0.2523      0.916291     -1.377136
3.5      0.1402      1.252763     -1.964685
4.5      0.0654      1.504077     -2.727233
TOTAL    1.0000

Taking logarithms, equation (6.5) can be written as follows:

$$ w = A_0 + A_1 x_1 + A_2 x_2, $$    (6.6)

where

$$ w = \ln f^*(x); \qquad x_1 = \ln x; \qquad x_2 = x; $$
$$ A_0 = \ln\left[ 1/\Gamma(\alpha)\beta^{\alpha} \right]; \qquad A_1 = \alpha - 1; \qquad A_2 = -1/\beta $$    (6.7)

Data from Table 6-3 permit the following equations to be written:

$$ \begin{aligned}
-1.677182 &= A_0 - 0.693147 A_1 + 0.5 A_2 \\
-1.035074 &= A_0 + 0.405465 A_1 + 1.5 A_2 \\
-1.377136 &= A_0 + 0.916291 A_1 + 2.5 A_2 \\
-1.964685 &= A_0 + 1.252763 A_1 + 3.5 A_2 \\
-2.727233 &= A_0 + 1.504077 A_1 + 4.5 A_2
\end{aligned} $$    (6.8)

Average the first three of equations (6.8), then average the middle three
(the second through fourth), and finally average the last three to get the
following three linear equations:

$$ \begin{aligned}
-1.363131 &= A_0 + 0.209536 A_1 + 1.5 A_2 \\
-1.458965 &= A_0 + 0.858173 A_1 + 2.5 A_2 \\
-2.023018 &= A_0 + 1.224377 A_1 + 3.5 A_2
\end{aligned} $$    (6.9)

Solving equations (6.9) simultaneously yields

$$ A_0 = 0.046223; \qquad A_1 = 1.65781; \qquad A_2 = -1.17115 $$    (6.10)

From equations (6.7) these constants become:

$$ \alpha = 2.65781; \qquad \beta = 0.85386; \qquad K = \left[ 1/\Gamma(\alpha)\beta^{\alpha} \right] = 1.047308 $$    (6.11)

From equations (6.11), it is seen that

$$ \Gamma(\alpha) = 1.453064 $$    (6.12)

Now $\Gamma(2.65781) = 1.65781\, \Gamma(1.65781)$, hence from tabulated values
of the gamma function

$$ \Gamma(2.65781) = 1.494234 $$    (6.13)

A mean value of $\Gamma(\alpha)$ is

$$ \bar{\Gamma}(\alpha) = (1.453064 + 1.494234)/2 = 1.473649 $$    (6.14)

Using $\bar{\Gamma}(\alpha)$, equations (6.11) become:

$$ \alpha = 2.65781; \qquad \beta = 0.85386; \qquad K = \left[ 1/\bar{\Gamma}(\alpha)\beta^{\alpha} \right] = 1.032683275 $$    (6.15)

Equation (6.5) can now be written thus:

$$ f(x) = 1.032683\, x^{1.65781} e^{-1.17115x} $$    (6.16)


Equation  (6.16)  is  the  basic  equation  from  which  theoretical  frequencies  can 
be  obtained. 
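The averaging procedure that leads from Table 6-3 to the gamma parameters can be sketched as follows. This is an illustration under stated assumptions: the three averaged equations are built from rows 1-3, 2-4, and 3-5 of the table, the 3 x 3 system is solved by Cramer's rule (the report does not say how it was solved), and all function names are hypothetical.

```python
from math import exp, log

# Table 6-3: cell midpoints and normalized observed frequencies.
x = [0.5, 1.5, 2.5, 3.5, 4.5]
f_star = [0.1869, 0.3552, 0.2523, 0.1402, 0.0654]

def mean(values):
    return sum(values) / len(values)

# Average w = A0 + A1*ln(x) + A2*x over three overlapping groups of rows.
rows, rhs = [], []
for start in range(3):
    xs = x[start:start + 3]
    fs = f_star[start:start + 3]
    rows.append([1.0, mean([log(v) for v in xs]), mean(xs)])
    rhs.append(mean([log(v) for v in fs]))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def cramer(m, r):
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for i in range(3):
            replaced[i][col] = r[i]
        solution.append(det3(replaced) / d)
    return solution

A0, A1, A2 = cramer(rows, rhs)
alpha = A1 + 1.0        # from A1 = alpha - 1
beta = -1.0 / A2        # from A2 = -1/beta
K = exp(A0)             # from A0 = ln K
print(f"alpha = {alpha:.5f}, beta = {beta:.5f}, K = {K:.6f}")
```

The result should reproduce the constants in (6.10) and (6.11) to within rounding of the tabulated logarithms.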

Another method for obtaining the parameters α and β is from the moment
generating function, from which it is found that for the gamma density
function:

$$ \mu = \beta(\alpha + 1) $$
$$ \sigma^2 = \beta^2(\alpha + 1) $$    (6.17)

From the data in Table 6-3, estimators for α and β are:

$$ \hat{\alpha} = \frac{\bar{x}^2 - s^2}{s^2} = 2.190075 $$

$$ \hat{\beta} = \frac{s^2}{\bar{x}} = 0.640128, $$

where

$$ \bar{x} = 2.042056 $$

$$ s^2 = 1.307177 $$


Two methods were used to compute theoretical frequency from equation (6.16):
(1) a power series representation, thus,

$$ P(a < x < b) = \frac{1}{\Gamma(\alpha)} \sum_{n=0}^{\infty} \frac{(-1)^n \left( b^{\alpha + n} - a^{\alpha + n} \right)}{(\alpha + n)\, \beta^{\alpha + n}\, n!} $$

and (2) a numerical integration scheme using the trapezoidal rule. An example
of application of the trapezoidal rule is shown in Table 6-4. Results from
both integration schemes are found in Table 6-5.


TABLE 6-4. NUMERICAL INTEGRATION RESULTS USING
THE TRAPEZOIDAL RULE

x        f*(x)         [f(x_{n-1}) + f(x_n)]/20
2.0      0.3132048     -
2.1      0.3020605     0.030763
2.2      0.2902184     0.029614
2.3      0.2778861     0.028405
2.4      0.2652445     0.027157
2.5      0.2524497     0.025885
2.6      0.2396351     0.024604
2.7      0.2269135     0.023327
2.8      0.2143788     0.022065
2.9      0.2021084     0.020824
3.0      0.1901649     0.019614
TOTAL:                 Σ = 0.252258
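The trapezoidal entries of Table 6-4 can be regenerated from equation (6.16). A minimal sketch follows, assuming ten equal steps of 0.1 over the cell from 2.0 to 3.0, as the table layout suggests; the function names are hypothetical.

```python
from math import exp

# Fitted density from equation (6.16).
def gamma_density(x, k=1.032683, power=1.65781, rate=1.17115):
    return k * x ** power * exp(-rate * x)

# Composite trapezoidal rule with n equal steps on [a, b].
def trapezoid(f, a, b, n):
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

area = trapezoid(gamma_density, 2.0, 3.0, 10)
print(f"P(2 < x < 3) is approximately {area:.6f}")
```

With a step of 0.1 each strip contributes (f(x_{n-1}) + f(x_n))/20, which is the quantity listed in the third column of Table 6-4.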

From tabulated values of the Chi-square statistic,

$$ \chi^2_{T/\mathrm{d.f.}=3;\ \alpha=0.05} = 7.81 $$

The numerical value of the Chi-square statistic as computed by the power series
expression is

$$ \chi^2_{c/\mathrm{d.f.}=3;\ \alpha=0.05} = (107)(0.0076150) = 0.814805 \quad \text{(Power Series)} $$

From the Trapezoidal Rule,

$$ \chi^2_{c/\mathrm{d.f.}=3;\ \alpha=0.05} = 0.243429 $$

Since $\chi^2_c$, as computed either by the power series representation or the
trapezoidal rule, is less than $\chi^2_T$, the null hypothesis, $H_0$, is
accepted and we conclude that with 95 percent confidence the random sample of
A-6 ejection injury data was extracted from a gamma parent probability density
function.


6.3  An  Analysis  of  Ejection  Related  Injury  Data  from 

Seven  U.S.  Navy  High  Performance  Aircraft 

For comparison of A-6 injury data with injury data from several U.S. Navy
high performance aircraft, an injury frequency histogram was constructed, and
an exponential parent density function hypothesized. A graphical display is
shown in Figure 6-3. Raw data, on which the analysis is based, are shown in
Tables 6-6 and 6-7.

The following null hypothesis was formulated:

    $H_0$: The sample of injury data from seven U.S. Navy high performance
    aircraft is a random sample extracted from a parent probability density
    function which is exponentially distributed.

The parent probability density function, hypothesized in the null hypothesis,
is of the form:

$$ f(x) = a e^{-ax}; \qquad x \ge 0; \quad a > 0 $$    (6.21)

To determine the parameter a in the parent probability density function, data
in Table 6-8 were used. From equation (6.21) the following relationship is
easy to obtain:

$$ w = A_0 + A_1 x_1 $$    (6.22)

where

$$ w = \ln f^*(x), \qquad A_0 = \ln a, \qquad A_1 = -a, \qquad x_1 = x $$    (6.23)


FIGURE 6-3. HISTOGRAM OF EJECTION RELATED INJURY DATA FROM SEVEN
U.S. NAVY HIGH PERFORMANCE AIRCRAFT


TABLE 6-6. EJECTION RELATED INJURY DATA FOR SEVEN SELECTED
U.S. NAVY HIGH PERFORMANCE AIRCRAFT

Aircraft    No/Minimal    Minor    Major    Fatal    Lost and    Totals
                                                     Unknown
               109          60       35
                 9          13        5
                20          38       28
                89          34       24
               147          68       33
                63          22        6
                12           4       14
TOTALS         449         239      145

TABLE 6-7. NORMALIZED EJECTION RELATED INJURY DATA FOR SEVEN
SELECTED U.S. NAVY HIGH PERFORMANCE AIRCRAFT

Aircraft    No/Minimal    Minor     Major     Fatal     Lost and    Total
                                                        Unknown
             0.4523       0.2490    0.1452    0.1203    0.0332       241
             0.2500       0.3611    0.1389    0.1111    0.1389        36
             0.1869       0.3551    0.2617    0.1308    0.0654       107
             0.5394       0.2061    0.1455    0.0727    0.0364       165
             0.5052       0.2337    0.1134    0.0790    0.0687       291
             0.6176       0.2157    0.0588    0.0686    0.0392       102
             0.3429       0.1143    0.4000    0.1143    0.0286        35
All A/C      0.4596       0.2446    0.1484    0.0952    0.0522       977

TABLE 6-8. DATA USED TO COMPUTE THE PARAMETER IN AN EXPONENTIAL
PARENT PROBABILITY DENSITY FUNCTION FROM WHICH THE
RANDOM SAMPLE OF EJECTION RELATED INJURY DATA
FOR SEVEN AIRCRAFT WAS EXTRACTED

x          f(x)     f*(x)      ln f*(x)
0.5        449      0.4596     -0.777399
1.5        239      0.2446     -1.408131
2.5        145      0.1484     -1.907844
3.5         93      0.0952     -2.351775
4.5         51      0.0522     -2.952673
TOTALS:    977      1.0000     -

The following linear equations are an immediate consequence of Table 6-8 and
equation (6.22):

$$ \begin{aligned}
-0.777399 &= A_0 - 0.5a \\
-1.408131 &= A_0 - 1.5a \\
-1.907844 &= A_0 - 2.5a \\
-2.351775 &= A_0 - 3.5a \\
-2.952673 &= A_0 - 4.5a
\end{aligned} $$    (6.24)

Take an arithmetic average of the first three equations in (6.24), then of the
last three equations. Thus:

$$ \begin{aligned}
-1.364458 &= A_0 - 1.5a \\
-2.404097 &= A_0 - 3.5a
\end{aligned} $$    (6.25)

Solve these simultaneously to get:

$$ a_1 = 0.5198195 $$

$$ A_0 = \ln a_2 = -0.58472875 \;\rightarrow\; a_2 = 0.557256997 $$

Let $a = \bar{a} = (a_1 + a_2)/2$; then

$$ a = 0.5385382 $$    (6.26)

Thus, equation (6.21) is written:

$$ f(x) = 0.5385382\, e^{-0.5385382x} $$    (6.27)
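The two-group averaging that produces the rate parameter can be traced step by step. The sketch below follows the procedure described for equations (6.24) through (6.26), using the normalized frequencies of Table 6-8; the variable names are hypothetical.

```python
from math import exp, log

# Table 6-8: cell midpoints and normalized observed frequencies.
x = [0.5, 1.5, 2.5, 3.5, 4.5]
f_star = [0.4596, 0.2446, 0.1484, 0.0952, 0.0522]
w = [log(v) for v in f_star]            # w = ln f*(x) = A0 - a*x

def mean(values):
    return sum(values) / len(values)

# Average the first three and the last three equations of (6.24).
x_first, w_first = mean(x[:3]), mean(w[:3])
x_last, w_last = mean(x[2:]), mean(w[2:])

a1 = (w_first - w_last) / (x_last - x_first)   # slope estimate of a
A0 = w_first + a1 * x_first                    # intercept, A0 = ln a2
a2 = exp(A0)                                   # intercept estimate of a
a = 0.5 * (a1 + a2)                            # mean of the two estimates
print(f"a1 = {a1:.7f}, a2 = {a2:.7f}, a = {a:.7f}")
```

Because an exponential density has only one free parameter, the slope and the intercept each yield an estimate of a; the report reconciles them by taking their mean.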


To  get  theoretical  frequencies  over  the  designated  frequency  cells, 
recall  that 


P(A 


< x < B)  = J' 


B 


f(x)  dx 


(6.28) 


x=A 


Substitute  equation  (6.21)  into  (6.28)  to  get: 


P(A  < x < B)  = e aA  - e aB 


(6.29) 


Recall  that 


P(A  < x < B)  = P(0  < x < B)  - P(0  < x < A), 


then  from  equation  (6.29), 


P(0  < x < B)  = 1 - e-aB, 


or  for  the  numbers  being  used: 


P(0  < x < B)  - 1 - e 


-0.5385382B 


(6.30) 


Theoretical frequencies are computed from equation (6.30). Numerical values
are given in Table 6-9.

TABLE 6-9. NUMERICAL VALUES OF OBSERVED FREQUENCY, THEORETICAL
FREQUENCY, AND THE CHI-SQUARE VALUES

 Interval           Observed      Theoretical     Chi-Square Values
 Frequency Cell     Frequency     Frequency       = (f_i - F_i)^2 / F_i
                    = f_i         = F_i

 0 ≤ x < 1          0.4596        0.416399        0.004482
 1 ≤ x < 2          0.2446        0.243011        0.000010
 2 ≤ x < 3          0.1484        0.141821        0.000305
 3 ≤ x < 4          0.0952        0.082767        0.001869
 4 ≤ x ≤ 5          0.0522        0.048303        0.000314

 TOTALS:            1.0000        0.932301        0.006981
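As a check on the theoretical frequency column of Table 6-9, the cell probabilities can be recomputed from equations (6.29) and (6.30). The following is a minimal sketch under the assumption that each cell probability is the difference of two values of (6.30); the function and variable names are illustrative only.

```python
import math

# Parameter estimate from equation (6.26).
a = 0.5385382

# Frequency cells of Table 6-9 as (lower limit A, upper limit B).
cells = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]


def cell_probability(lower, upper, rate):
    """P(lower < x < upper) for an exponential density, equation (6.29)."""
    return math.exp(-rate * lower) - math.exp(-rate * upper)


theoretical = [cell_probability(lo, hi, a) for lo, hi in cells]

for (lo, hi), F in zip(cells, theoretical):
    print(f"{lo} to {hi}: {F:.6f}")
print(f"total: {sum(theoretical):.6f}")
```

The total is less than one because the exponential density assigns some probability to x > 5, which lies outside the five cells.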

Translated to the sample size of 977 aircraft,

\[ \chi^2_c = (977)(0.006981) = 6.820449. \]

The theoretical value of the Chi-square statistic at α = 0.05 and 4 degrees of
freedom is

\[ \chi^2_{T;\ \mathrm{d.f.}=4;\ \alpha=0.05} = 9.49. \]

Since $\chi^2_c < \chi^2_T$, accept the null hypothesis and conclude that at the
α = 0.05 level of significance the sample was a random sample of size 977
extracted from a parent probability density function which is exponentially
distributed.
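The goodness-of-fit computation above can be reproduced with a short Python sketch. It is illustrative only: the observed and theoretical frequencies are copied from Table 6-9, the critical value 9.49 is the tabulated value quoted in the text rather than one computed here, and the variable names are assumptions.

```python
# Normalized observed (f_i) and theoretical (F_i) frequencies from Table 6-9.
observed = [0.4596, 0.2446, 0.1484, 0.0952, 0.0522]
theoretical = [0.416399, 0.243011, 0.141821, 0.082767, 0.048303]

SAMPLE_SIZE = 977          # sample size quoted in the text
CHI_SQUARE_CRITICAL = 9.49  # tabulated value for 4 d.f. at alpha = 0.05

# Sum of (f_i - F_i)^2 / F_i over the cells, scaled to the sample size.
normalized_sum = sum((f - F) ** 2 / F for f, F in zip(observed, theoretical))
chi_square_computed = SAMPLE_SIZE * normalized_sum

# Accept the null hypothesis when the computed value is below the critical value.
accept_null = chi_square_computed < CHI_SQUARE_CRITICAL
print(f"computed = {chi_square_computed:.4f}, accept H0: {accept_null}")
```

The result agrees with the reported value of about 6.82 to within rounding of the tabulated entries.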

A comparison between the A-6 ejection related injury pattern and
the ejection related injury pattern for seven U.S. Navy high performance
aircraft is shown in Figure 6-4.

FIGURE 6-4. COMPARISON OF THE A-6 EJECTION INJURY PATTERN WITH THE
EJECTION INJURY PATTERN FOR SEVEN AIRCRAFT


7.0  Summary 


This report presents the results of statistical techniques applied to
ejection related injury data arising from ejections from U.S. Navy high
performance aircraft. The data, taken from the Medical Officer's Reports
(MORs), cover the period 1 January 1969 through 31 December 1975. The
analysis reported here was performed from 15 May 1976 through 15 August
1976.


The  statistical  tests  applied  to  ejection  related  injury  data  consisted 
of  the  following: 

• Two-Way  Analysis  of  Variance 

• Non-Parametric  Trend  Test 

• Non-Parametric  Run  Test 

• Deterministic  Analyses 

Numerical  Techniques 
Deterministic  Prediction 

• Hypothesis Testing using the following parent probability density
functions:

Runs Discrete Density Function
Normal Density Function
Gamma Density Function
Exponential Density Function

• Chi-Square  Density  Function  was  used  to  test  Goodness-of-Fit  of 
random  samples  to  the  preceding  parent  probability  density  functions. 

A summary of tests performed, results obtained, and confidence in the
results is found in Table 7-1. There it is noted that underlying trends are
strongly suspected to occur in ejection injury data developed from the
following U.S. Navy high performance aircraft: A-4, A-6, A-7, F-4, and F-8.
There is no reason to suspect a trend in the A-5 and F-9 ejection related
injury data.

Use of the run test revealed that, on a chronological basis, fatalities
occur randomly upon ejection from the A-4, A-7, F-4, and F-8 aircraft, but
they do not occur randomly upon ejection from the A-6.

Parent probability density functions were identified for ejection related
injury data from the A-6 aircraft (a generalized gamma density function), the
A-4 (an exponential density function), and seven high performance aircraft, the
A-4, A-5, A-6, A-7, F-4, F-8, and F-9, considered collectively as one sample
(the exponential density function). The Chi-square statistic was used to test
goodness-of-fit of the sample data to the parent density functions hypothesized.

Analysis of variance techniques were used to detect differences among
various ejection related fatality scenarios. Components of the scenarios were:
hardware hazards, aircrew judgment, environmental conditions, high performance
aircraft (A-4, A-7, A-6, F-4, and F-8), and ejection seats (ESCAPAC and
Martin-Baker).

Statistical investigations conducted to date, while informative, should
be categorized as initial investigations only. Certain problem areas have been
detected. It is believed that certain other problem areas will be discovered
as the investigation continues. Areas that need immediate attention are the
following:

• An in-depth investigation of all A-6 ejections.

• Determine the underlying injury trends in the A-4, A-6, A-7, F-4,
and F-8 ejection related injury data.

• Reconcile any apparently inconsistent results detected by the
analysis of variance techniques.

• After sufficient analytical results have been obtained from both
preliminary and in-depth investigations, the question of reasons for
the particular ejection injury scenario needs to be addressed.

• Statistical and deterministic techniques should continue to be
applied to additional ejection data, so that predictive trends can
be identified in ejection equipment failure data as well as ejection
related injury data.

APPENDIX A
LIST OF SYMBOLS


A.1 English Symbols

A - lower limit in a probability calculation; thus, P(A < x < B)
reads "the probability that x lies between A and B"; here A is
the least value that x can assume.

- injury designation in the Medical Officer's Reports indicating
a fatality.

- ejection related fatality which is attributed primarily to the
aircrew.

- a numerical value of a data point which exceeds the median of
the data set.

- constants in a polynomial representation of T versus t.

- constants used in regression analysis to determine the
parameter in an exponential parent probability density function.

$A_0$, $A_1$, $A_2$ - constants used in regression analysis to determine the
two parameters in a gamma parent probability density function.

ANOVA - Analysis of Variance.

b - parameter in an exponential parent probability density
function.

B - upper limit in a probability calculation; thus, P(A ≤ x ≤ B)
reads "the probability that x lies between A and B"; here B is
the greatest value that x can assume.


- injury designation in the Medical Officer's Reports, indicating
a major injury.

- constants to be determined in an exponential equation (5.10).

$b_j$ - slope of the jth line segment.

c - designates columns in an Analysis of Variance computation.

$c_1$, $c_2$ - constants to be determined in an exponential equation (5.18).

$C_j$ - sum of all the row elements in the jth column of an Analysis
of Variance matrix.

- differences in iterative schemes, for two separate difference
equations which approach zero as the number of iterations
increases.

E - ejection related fatality which can be attributed primarily
to the environment.

e - base of the natural logarithms.

E(u) - mean of the parent probability density function for the random
variable u which represents the number of runs, thus:
u = 0, 1, 2, ..., n.

F - injury designation in the Medical Officer's Reports, indicating
a minor injury.

$F_i$ - theoretical frequency in the ith frequency cell.

$f_i$ - observed frequency in the ith frequency cell.

f(u) - discrete parent density function of the number of runs u in a
non-parametric run test.


f(x) - a dependent variable; may be a function of a discrete or
continuous real variable x.

$F_{o,c}(m,n)$ - theoretical value of the F-statistic with m and n degrees of
freedom. The subscript o,c means the test is being performed
on the null hypothesis with respect to ANOVA columns.

$F_{o,c}(m,n)$ - value of the F-statistic computed from data, with m and n
degrees of freedom; used when testing a null hypothesis with
respect to columns in an ANOVA matrix.

$F_{o,r}(m,n)$ - theoretical value of the F-statistic with m and n degrees of
freedom; used when testing a null hypothesis with respect to
rows in an ANOVA matrix.

$F_{o,r}(m,n)$ - value of the F-statistic computed from data, with m and n
degrees of freedom; used when testing a null hypothesis with
respect to rows in an ANOVA matrix.

- sum of all the elements in an Analysis of Variance matrix.

- injury designation in the Medical Officer's Reports indicating
no injury or minimal injury.

- ejection related fatality which can be attributed primarily
to hardware.

- a null hypothesis to be tested.

- null hypothesis with respect to the columns in an ANOVA matrix.

- null hypothesis with respect to the rows in an ANOVA matrix.

- an integer representation in a computer flow diagram; so also
are J, K, L, M, and N.


- index of summation; can be used as a discrete counting
variable.

- index of summation; can be used as a discrete counting
variable.

K - a constant defined as follows: $K = [\beta^{\alpha}\,\Gamma(\alpha)]$, used to
determine the parameters in a gamma parent probability
density function.

- an index of summation.

k - a parameter in the discrete parent probability density
function of the number of runs in a non-parametric run
test. Thus, u = 2k if u is an even number of runs, and
u = 2k + 1 if u is an odd number of runs.

- injury designation in the Medical Officer's Reports; indicates
an aircrewman that was lost, that is, his body was
not recovered.

- represents a line segment over the jth time interval;
j = 1, 2, 3, 4.

- slope of the jth line segment $L_j$.

MOR - Medical Officer's Reports.

- number of data elements in the sample under a trend test
investigation.

- lower limit for the number of trends in the null hypothesis
acceptance region of a non-parametric trend test.

- upper limit for the number of trends in the null hypothesis
acceptance region of a non-parametric trend test.


- number of trends computed in a non-parametric trend test.
The subscript c represents computed.

- index of summation.

- the sum of all elements of a given designation in a dichotomous
(0, 1; or a, b) non-parametric run test; say the sum of
all the 1's or all the a's.

- the sum of all elements of a given designation in a dichotomous
(0, 1; or a, b) non-parametric run test; say the sum of
all the 0's or all the b's.

- total number of data points in the jth time interval.

P(A < x < B) - the probability that x lies between the lower limit A and the
upper limit B.

- designates rows in an Analysis of Variance computation.

- sum of all the column elements in the ith row of an Analysis
of Variance matrix.

$s^2$ - an unbiased estimate of population variance defined thus:

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]

SAR - Search And Rescue.

SSC - sum of squares on the column elements in an Analysis of
Variance matrix.

SSE - sum of the squares of the errors in an Analysis of Variance
matrix.

SSR - sum of squares on the row elements in an Analysis of Variance
matrix.

SST - total sum of squares of the elements in an Analysis of
Variance matrix.

- total fatalities at the ith data point.

- arithmetic mean of the number of fatalities in the jth time
interval; sometimes written as $\bar{T}_{L_j}$.

- time coordinate of the ith data point at which the total
fatalities were experienced.

$t_e$ - termination time for fatalities clustered about line segment $L_j$.

- time length of line segment $L_j$; that is, the termination time
minus the starting time.

- starting time for fatalities clustered about line segment $L_j$.

- arithmetic mean of time values in the jth time interval;
sometimes written as $\bar{t}_{L_j}$.

- the arithmetic mean of successive values of t, where

U - injury designation in the Medical Officer's Reports indicating
that an aircrewman was lost, but it is not known how or where
he was lost; the loss is thus unknown.

u - number of runs in a non-parametric run test.

- a positive integer such that

\[ \sum_{u=0} f(u) < \alpha \]


Var(u) - variance of the parent probability density function from which
the random sample of u-runs was extracted.

v - shorthand notation for ln f*(x), where

\[ f^*(x_i) = f(x_i) \Big/ \sum_i f(x_i) \]

x - an independent real variable; may be discrete or continuous.

$x_i$ - mid-point of the ith frequency class interval.

$x_i$ - the ith element in the set of all x's.

x(I) - computer flow diagram notation for $x_i$.

$x_{i,j}$ - element found at the intersection of the ith row and jth
column in an Analysis of Variance matrix.

$\bar{x}$ - sample mean defined thus:

\[ \bar{x} = \sum_{i=1}^{n} x_i / n. \]

$y_i$ - observed frequency in the ith frequency class interval.

$y_i^*$ - normalized value of frequency in the ith frequency class
interval.

$z_c$ - a standardized value, computed from given data, of an
independent variable, assumed here to be normally distributed,
and defined as follows:

\[ z_c = [u - E(u)] / \sqrt{\mathrm{Var}(u)}; \]

it is sometimes designated as zn. Here, c represents computed.

- a value of z such that

\[ \int^{-z} f(z)\,dz = \alpha; \]

where f(z), here assumed to be continuous, is the parent
probability density function of the real continuous variable z.


A.2 Greek Symbols

α - level of significance in the test of an hypothesis; equals the
probability of a Type I error.

α - a parameter in the gamma parent probability density function.

- an estimator of the parameter α in the gamma parent
probability density function.

β - a parameter in the gamma parent probability density function.

- an estimator of the parameter β in the gamma parent
probability density function.

Γ(α) - the gamma function of the argument α, where α > 0 represents
a real number.

$\bar{\Gamma}(\alpha)$ - the arithmetic mean of two values of Γ(α).

- time distance between successive values of t.

$\mu_x$ - mean of the parent probability density function for the real
variable x. A similar definition holds for $\mu_y$.

$\sigma_x$ - standard deviation of the parent probability density
function for the real variable x. A similar definition holds
for $\sigma_y$.

$\chi^2_c$ - a computed value of the Chi-square statistic with four
degrees of freedom.

$\chi^2_T$ - the theoretical value of the Chi-square statistic with four
degrees of freedom.

$\ln y_i^*$ - natural logarithm of the normalized value of $y_i$, where

\[ y_i^* = y_i \Big/ \sum_{i=1} y_i ; \]

$\ln y_i^*$ is sometimes written $\ln f^*(x_i)$.

DEFINITIONS OF SELECTED STATISTICAL TERMS


All  definitions  which  follow  are  those  for  expressions  contained  herein. 

They  are  given  to  assist  the  reader  as  he  studies  this  report. 

Definitions 

1. Class Interval — an interval within which persons, places or things
having a prescribed quantified attribute can be found.

Example: A study is to be made of a sample of 100 persons residing
in Fairfax County, Virginia, having homes which range in value from
$50,000 to $150,000. The study is to be sub-divided into housing
value increments of $10,000. The first class interval in the study
consists of all persons in the 100 person sample having homes which
vary in value from $50,000 to $59,999.99.

2. Confidence Coefficient — the confidence that can be ascribed to an
attribute which a parameter in a given parent probability density function
can have.

Example: If P(3 < μ < 4) = 0.95, then for a large number of random
samples extracted from a given parent probability density function,
the true parent mean will lie between 3 and 4 about 95 percent of the
time. The interval 3 < μ < 4 is called the 95 percent confidence
interval.

3. Continuous Probability Density Function — Define this probability density
function by f(x). Then, f(x) represents the domain of a continuous real
variable x, where the range of x is the infinite interval -∞ < x < +∞,
a semi-infinite interval 0 ≤ x < +∞, or a finite interval a ≤ x ≤ b.
Further, f(x) is assumed to possess all other properties which make it a
valid continuous probability density function.

Examples of Continuous Probability Density Functions: The normal,
t, F, Chi-square, gamma, exponential, and beta.


4. Discrete Probability Density Functions — Define this probability density
function by g(x). Then g(x) represents the domain of a discrete real
variable x, where x can only assume the discrete values h, 2h, 3h, ..., nh
for some real number h. Further, g(x) is assumed to possess all other
properties which make it a valid discrete probability density function.

Examples of Discrete Probability Density Functions: The binomial,
multinomial, Poisson, hypergeometric, negative binomial, and runs
density function associated with a non-parametric run test.

5. Estimator — a number developed from a random sample which estimates a
parameter in a parent probability density function. This is sometimes
called estimate or sample statistic.

Example: The sample mean, $\bar{x}$, is an estimator for the parent
population mean, μ.

6. Frequency, Observed — the number of times which persons, places or things,
having a prescribed quantified attribute, are observed to appear in a
given class interval.

Example: If ten persons own homes in the price range from $50,000
to $59,999.99 (see definition of Class Interval above), then the
observed frequency in the first class interval is ten.

7. Frequency, Theoretical — the number of times which persons, places, or
things, having a prescribed quantified attribute, would be expected to
appear in a given class interval. This value is determined by evaluating
the area, under the parent probability density function, over the class
interval.

Example: If the random sample of definition 1 above is extracted
from a continuous parent probability density function, then, by
definition:

\[ F_1 = \int_{50{,}000}^{59{,}999.99} f(x)\,dx, \]

where $F_1$ is the theoretical frequency in the first class interval.
Thus, $F_1$ might not be a positive integer, as is required for the
observed frequency.

8. Histogram — a bar graph or chart depicting observed frequency versus class
interval for the set of class intervals which span the range of the data
sample being studied.

9. Hypothesis — a generalized statement made about a parent population, the
plausibility of which is to be tested (within confidence limits) by an
analysis of a random sample, or analyses of random samples.

Example: Refer to definition 1 above and formulate the following
null hypothesis, denoted by $H_0$:

$H_0$: The average value of all homes in the parent population (see
definition 1 above) is $80,000.

An alternate hypothesis could be the following:

$H_1$: The average value of all homes in the parent population (see
definition 1 above) is greater than $80,000.

10. Level of Significance — this is used to establish the confidence which can
be associated with the rejection of a given hypothesis. This quantity is
usually denoted by α.

Example: If the null hypothesis, $H_0$, (see definition 9 above) is
correct at the α = 0.05 level of significance for a large number
of random samples drawn from a parent probability density function,
it can be expected, on the average, that 95 percent of these will
have a mean value between $75,000 and $85,000.

11. Mean — this is a measure of central tendency either in a parent
probability density function or in a random sample. The mean, μ, of a parent
probability density function is estimated by the mean $\bar{x}$ of a random
sample extracted from the parent density function.


12. Median — a measure of central tendency in a parent population or in a
sample such that there are as many values above it as there are below.

Example: Given the random sample of size 7 as follows: 1.1, 2.2,
3.2, 4.6, 5.6, 6.5, 7.3, the number 4.6 is the median of this sample
since there are three numbers in the sample less than 4.6, and three
numbers greater than 4.6. As a matter of passing interest, the mean
of the above sample is approximately 4.357.
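The median and mean quoted in this example can be verified with Python's standard `statistics` module. This is only an illustrative check; the variable names are assumptions.

```python
import statistics

# Random sample of size 7 from the Median example.
sample = [1.1, 2.2, 3.2, 4.6, 5.6, 6.5, 7.3]

sample_median = statistics.median(sample)  # middle value of the sorted sample
sample_mean = statistics.mean(sample)      # arithmetic average

print(f"median = {sample_median}, mean = {sample_mean:.3f}")
```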

13. Parameter — an unknown constant in a parent probability density function
that can be estimated from an analysis of data contained in a random
sample.

Example: If a parent probability density function is defined by the
exponential equation $f(x) = b\,e^{-bx}$, b is the unknown parameter in
that equation which may be estimated from sample data.

14. Parent Probability Density Function — a mathematical function, discrete or
continuous, which describes probability for specific values or ranges of
values, of the designated independent variate.

Example: The normal probability density function, defined thus:

\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2 / (2\sigma^2)}, \]

can be viewed as a parent probability density function.

15. Run — In a given random sample consisting of a dichotomous data set (such
as 0's and 1's or a's and b's), a run is a sequence of elements of one
kind followed by elements of the other kind, or followed by no elements
at all.

Example: If a random sample consists of the following data set:

a, a, a, b, b, a, b, a, a, b

the set consists of six runs.
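Counting runs in a dichotomous sequence amounts to counting the places where the element changes, plus one. The sketch below applies this to the example data set; the function name `count_runs` is an assumption introduced for illustration.

```python
def count_runs(sequence):
    """Count runs: maximal blocks of identical consecutive elements."""
    if not sequence:
        return 0
    # Each change between neighbouring elements starts a new run.
    changes = sum(1 for prev, cur in zip(sequence, sequence[1:]) if prev != cur)
    return changes + 1


# Data set from the Run example.
sample = ["a", "a", "a", "b", "b", "a", "b", "a", "a", "b"]
runs = count_runs(sample)
print(runs)
```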

16. Standard Deviation — one measure of dispersion of data about the mean
value. It is the square root of the variance.

17. Trend — an expression used in non-parametric trend analysis defined as
follows: Given a data set $x_1, x_2, \ldots, x_i, x_{i+1}, \ldots, x_n$. Start
with $x_1$, and compare it with all other elements in the data set. Count the
number of times $x_1 > x_j$, j = 2, 3, ..., n. Repeat using
$x_2, x_3, \ldots, x_{n-1}$. Get the sum of all such "trends" in the data set.
An inference about an underlying trend in the data set can be made from the
sum of all the inequalities $x_i > x_j$ for i < j, where i = 1, 2, ..., n.
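The counting procedure in the Trend definition can be sketched as a double loop over all pairs i < j. The function name `count_trends` and the sample data below are hypothetical and are not taken from the report.

```python
def count_trends(data):
    """Count pairs (i, j) with i < j for which data[i] > data[j]."""
    n = len(data)
    return sum(
        1
        for i in range(n - 1)
        for j in range(i + 1, n)
        if data[i] > data[j]
    )


# Hypothetical data set, for illustration only.
example = [5, 3, 4, 1]
trend_count = count_trends(example)
print(trend_count)
```

A steadily increasing data set yields a count of zero, while a steadily decreasing set of n elements yields the maximum count of n(n - 1)/2.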

18. Variance — a measure of dispersion about the mean. This can be a measure
of dispersion about the mean of a known parent probability density
function, in which case it is denoted by $\sigma^2$, the variance of the parent
probability density function. It can also be a measure of dispersion about
the mean of a random sample, in which case the sample variance is denoted
by $s^2$.